Let’s design a Currency type in Rust. We’ll iterate through several versions, starting with the ubiquitous str, and finishing with a stack-allocated type built with Rust’s new const generics.
The full code for this post is available here.
V0: Not even going to bother
To begin with, let’s not even bother with type-safety. After all, everything’s a string or byte array if you go down far enough in the stack.
#[derive(Debug)]
pub struct Event {
pub account: String,
pub net_cash: f64,
pub gbp_cash: f64,
pub currency: String,
pub narrative: String,
}
pub fn handle_event(acc: &str, net_cash: f64, cur: &str, narrative: &str) -> Event {
info!(
"Account {} {}ed with {} {}: {}",
acc,
if net_cash < 0.0 { "debit" } else { "credit" },
net_cash,
cur,
narrative
);
Event {
account: acc.into(),
net_cash,
gbp_cash: fx_convert(net_cash, cur, "GBP"),
currency: cur.into(),
narrative: narrative.into(),
}
}
pub fn fx_convert(amount: f64, _cur1: &str, _cur2: &str) -> f64 {
amount * 1.2
}
This is the code we’ll be iterating over. It’s a stub of some financial logic. The bit we’re interested in is the handling of the currency values: the currency field of Event, the cur parameter of handle_event, and the _cur1 and _cur2 parameters of fx_convert.
The two aspects we want to improve on are type-safety and ergonomics. The code above isn’t very type-safe because it’s trivial to use incorrect &str values as currencies. For example, the intended use of the function is obvious if you look at its implementation, but imagine running into this call in the wild: handle_event("EUR", 100.0, "ACC1", "Credit 100€"). It compiles, and it’s wrong, but it doesn’t look wrong.
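To make the hazard concrete, here is a minimal sketch with a trimmed-down Event (the narrative, logging, and FX conversion are left out; that trimming is an assumption of this sketch, not the post’s code). The account and currency arguments are swapped, and the compiler accepts the call because both are plain &str values.

```rust
// Trimmed-down stand-in for the post's `Event`: only the fields that matter here.
#[derive(Debug)]
pub struct Event {
    pub account: String,
    pub net_cash: f64,
    pub currency: String,
}

// Same parameter order as the post's `handle_event`, minus the narrative.
pub fn handle_event(acc: &str, net_cash: f64, cur: &str) -> Event {
    Event {
        account: acc.into(),
        net_cash,
        currency: cur.into(),
    }
}

// Account and currency are swapped, yet nothing in the types can object.
pub fn swapped_call() -> Event {
    handle_event("EUR", 100.0, "ACC1")
}
```

The resulting event records "EUR" as the account and "ACC1" as the currency, and nothing flags it until the data is used downstream.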
Ergonomics-wise, using &str everywhere is pretty ok, but it’s going to get worse when we introduce strong types, and then it’s going to get better again as we iterate.
A bad solution: using an enum
The first thing we might do is consider the problem domain. There’s a finite and small number of countries, so isn’t there some fixed list of currencies? Indeed there is: ISO 4217 has the list of all valid currency codes.
So, we could create an enum of all the valid currencies and have some functions convert back and forth between &str and Currency:
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub enum Currency {
GBP,
EUR,
}
impl fmt::Display for Currency {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Currency::GBP => write!(f, "GBP"),
Currency::EUR => write!(f, "EUR"),
}
}
}
impl str::FromStr for Currency {
type Err = eyre::Error;
fn from_str(str: &str) -> Result<Self, Self::Err> {
match str {
"EUR" => Ok(Currency::EUR),
"GBP" => Ok(Currency::GBP),
_ => bail!("Unknown currency: '{str}'"),
}
}
}
#[derive(Debug)]
pub struct Event {
pub account: String,
pub net_cash: f64,
pub gbp_cash: f64,
pub currency: Currency,
pub narrative: String,
}
pub fn handle_event(acc: &str, net_cash: f64, cur: Currency, narrative: &str) -> Event {
info!(
"Account {} {}ed with {} {}: {}",
acc,
if net_cash < 0.0 { "debit" } else { "credit" },
net_cash,
cur,
narrative
);
Event {
account: acc.into(),
net_cash,
gbp_cash: fx_convert(net_cash, cur, Currency::GBP),
currency: cur,
narrative: narrative.into(),
}
}
pub fn fx_convert(amount: f64, _cur1: Currency, _cur2: Currency) -> f64 {
amount * 1.2
}
This is pretty great in all regards, except that it blows up on unknown currencies. We could list all the ISO 4217 currency codes, or we could use one of the several crates that do this, but we’d still have a problem whenever the list changed.
To illustrate, let me share a story from an old job. We had a variant type of currencies very much like the above, and everything was great until Venezuela changed its currency on short notice. In 2018, they replaced VEF with VES. Of course, we had to support the new currency in a hurry because of business reasons, but when we looked, there was a huge number of production systems that had been rolled linking to the library with the hard-coded currency codes. My team was familiar with some of the systems, and we could roll them on short notice, but most of the list was random apps, some of which didn’t even have listed maintainers and hadn’t been rolled in years. Worse, we couldn’t even tell which of the apps were actually affected by the change in the currency list because some only depended on the library indirectly, and not all would be expected to see the new currency in practice. Ultimately, we identified a few key systems that absolutely had to support the new currency, rolled them, and let everyone else handle errors in their apps whenever they popped up. This was clearly suboptimal, but it was good enough… until a couple of years later when we had to support currencies like Tether (USDT), again for business reasons, and then the floodgates were open for random strings as currencies.
The core issue here is that “currency” is a real-world concept, and our code has to conform. We can’t just hardcode a list of currencies, and hope the world agrees and never changes. Our currency type must be future proof and has to support everything that will get thrown at it.
There is an obvious workaround to the problem with the enum: we could add an Other(String) variant. This way, when we encounter a currency that we hadn’t hardcoded a variant for, we could just shove it into Other. Practically, this just makes our type a newtype wrapper around String with extra complexity around comparisons. It’s not good.
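As a rough sketch of that “extra complexity around comparisons”, consider the enum below with a catch-all variant. The infallible FromStr and the gbp_built_by_hand helper are assumptions made for illustration; the point is only that the same currency can end up with two representations that compare as unequal.

```rust
use std::str::FromStr;

// Note that `Copy` can no longer be derived once a variant holds a `String`.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub enum Currency {
    GBP,
    EUR,
    // Catch-all for codes that were not hardcoded.
    Other(String),
}

impl FromStr for Currency {
    type Err = std::convert::Infallible;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(match s {
            "GBP" => Currency::GBP,
            "EUR" => Currency::EUR,
            other => Currency::Other(other.to_string()),
        })
    }
}

// Hypothetical helper: nothing stops code from building the catch-all
// directly, which yields a second representation of the same currency.
pub fn gbp_built_by_hand() -> Currency {
    Currency::Other("GBP".to_string())
}
```

Keeping the two representations in sync would need either a private constructor that normalizes known codes or a hand-written PartialEq, Hash, and Ord, which is the complexity the paragraph above alludes to.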
V0.5: Type alias
So, we’re stuck with using some dynamic type like a String. The first thing we can do is define a type alias:
type Currency = String;
pub struct Event {
...
pub currency: Currency,
...
pub fn handle_event(acc: &str, net_cash: f64, cur: &Currency, narrative: &str) -> Event {
...
This helps with documenting the struct and function signature, but doesn’t do anything for type-safety since we can still pass any String where a Currency is expected.
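A small sketch of why the alias gives no protection; the describe and mixed_up functions are made up for this illustration. Because Currency is only another name for String, an account identifier is accepted wherever a currency is expected.

```rust
pub type Currency = String;

// Hypothetical function that is documented as taking a currency.
pub fn describe(cur: &Currency) -> String {
    format!("currency={}", cur)
}

pub fn mixed_up() -> String {
    // An account identifier, not a currency...
    let account: String = "ACC1".to_string();
    // ...but `&String` and `&Currency` are the same type, so this compiles.
    describe(&account)
}
```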
V1: Newtype wrapper
The construct that actually improves type-safety is a newtype.
I think the word “newtype” comes from Haskell where it’s an actual syntactic construct. At least, that’s why I’ve always called the pattern this.
Practically, we define a struct with a single String field. Since this is a completely new type, we have to re-implement all the traits for it. Some we can derive, and some we have to write ourselves. There are crates like aliri_braid that automate the boilerplate, but we’ll write it out manually to see what’s going on.
#[derive(Clone, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub struct Currency(String);
impl fmt::Display for Currency {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.0)
}
}
impl From<&str> for Currency {
fn from(str: &str) -> Self {
Self(str.to_string())
}
}
// More impls here: Serialize/Deserialize, ToSql/FromSql,
// From<String>, From<&String>, etc.
#[derive(Debug)]
pub struct Event {
pub account: String,
pub net_cash: f64,
pub gbp_cash: f64,
pub currency: Currency,
pub narrative: String,
}
pub fn handle_event(acc: &str, net_cash: f64, cur: &Currency, narrative: &str) -> Event {
info!(
"Account {} {}ed with {} {}: {}",
acc,
if net_cash < 0.0 { "debit" } else { "credit" },
net_cash,
cur,
narrative
);
Event {
account: acc.into(),
net_cash,
gbp_cash: fx_convert(net_cash, cur, &"GBP".into()),
currency: cur.clone(),
narrative: narrative.into(),
}
}
pub fn fx_convert(amount: f64, _cur1: &Currency, _cur2: &Currency) -> f64 {
amount * 1.2
}
This is pretty good for type-safety: we can no longer pass an account string in place of a currency, and we have to explicitly turn strings into currencies with .into(). It’s a bit of a pain to use, though. Having to call clone() whenever we move a currency value adds noise to the code, and having to write &"GBP".into() every time we have a special case gets old very quickly.
If all we cared about was type-safety, this blogpost could end here, but I don’t think that’s enough. In my experience, if code is ugly to read and annoying to write, then mistakes creep in. They might not be typing mistakes, but that’s little solace after shipping a buggy program.
For instance, imagine code like the following. It uses our Currency for currencies and BigDecimal for numbers, both types requiring explicit references and clones.
let ev = FxEvent {
cur1_amount: cur1_amount.clone(),
cur2_amount: &cur1_amount / fx_rate(&cur1, &cur2),
cur1: cur1.clone(),
cur2: cur1.clone(),
};
The bug is easy to spot in this 5 line snippet, but imagine if this was in the middle of 1000 lines of convoluted business logic.
If we didn’t have all the clones and references, the code would be the following, and it would be basically impossible to get wrong:
let ev = FxEvent {
cur1_amount,
cur2_amount: cur1_amount / fx_rate(cur1, cur2),
cur1,
cur2,
};
Let’s look at the two problems one at a time. What’s going on with the clone calls? Our type contains a String value, which is heap-allocated and doesn’t implement Copy, so Rust requires us to manually create copies of it by calling clone(). There’s no way around this: we could change the specific way our type contains the string using an Rc<_> or a Cow<_> or something else, but as long as the type potentially contains a heap-allocated value, it cannot have Copy.
As for the string conversions, we’d rather define a GBP constant once and use it everywhere instead of creating fresh "GBP" values in lots of places. The problem is that we can’t use heap-allocated types as const values, so no constant Strings, and no constant types that contain Strings. We could work around this with lazy_static, but then we’d have to refer to the constant with *GBP or &*GBP which looks ugly.
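The sketch below shows roughly what the lazily initialized workaround looks like. It swaps lazy_static for std::sync::LazyLock from the standard library (an assumption of this sketch; it needs Rust 1.80 or later), which has the same deref-to-use shape. The is_gbp and owned_gbp helpers are hypothetical.

```rust
use std::sync::LazyLock;

#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Currency(String);

impl From<&str> for Currency {
    fn from(s: &str) -> Self {
        Self(s.to_string())
    }
}

// A `const GBP: Currency` built from "GBP".to_string() is rejected because
// `to_string` is not a `const fn`, so the value is built lazily on first use.
pub static GBP: LazyLock<Currency> = LazyLock::new(|| Currency::from("GBP"));

pub fn is_gbp(cur: &Currency) -> bool {
    // The static has type `LazyLock<Currency>`, so comparing needs `*GBP`.
    *cur == *GBP
}

pub fn owned_gbp() -> Currency {
    // Getting an owned value needs an explicit clone of the deref target.
    (*GBP).clone()
}
```

Every use site has to remember that GBP is a wrapper and not a Currency, which is the noise the paragraph above complains about.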
To fix both our problems, we need to stack-allocate our type, and for that, we need to give it a fixed size known at compile time.
V1.5: Fixed size reference
A solution would be to wrap a &'a str instead of a String. The reference has a size known at compile time, so it can have Copy. The problem is that we now have a lifetime to deal with, and this lifetime will “infect” any type that contains Currency<'a>, and any function that uses it:
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub struct Currency<'a>(&'a str);
impl<'a> fmt::Display for Currency<'a> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.0)
}
}
impl<'a> From<&'a str> for Currency<'a> {
fn from(str: &'a str) -> Self {
Self(str)
}
}
#[derive(Debug)]
pub struct Event<'a> {
pub account: String,
pub net_cash: f64,
pub gbp_cash: f64,
pub currency: Currency<'a>,
pub narrative: String,
}
pub fn handle_event<'a>(
acc: &str,
net_cash: f64,
cur: Currency<'a>,
narrative: &str,
) -> Event<'a> {
...
All the <'a> annotations don’t look great, and if we added more types containing references, it would only get worse. Additionally, this limits what we can do with Currency<'a> because we can only use it within the lifetime of 'a. For instance, if we read the currency from a file or a database, we wouldn’t be able to return the value to a higher level in the program, and we wouldn’t be able to use it in a closure or a future either. While storing references like this might make sense if we were talking about lots of data, our Currency is effectively 3 bytes, so all this awkwardness is unjustified.
We could work around this by using the 'static lifetime instead of 'a. The former doesn’t need to be propagated through enclosing types, so it wouldn’t make the code uglier, and it also wouldn’t complicate our borrowing story since all lifetimes are smaller than 'static. The problem is that this is effectively the enum solution all over again: we’d have to hardcode all the possible currencies and blow up whenever we encounter a new one.
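A sketch of the 'static variant, to show where it stops working. Hardcoded codes are fine because string literals are already 'static; for a code read at runtime, one escape hatch is to leak the allocation, which is shown here only to illustrate the cost. The from_runtime helper is hypothetical.

```rust
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct Currency(&'static str);

// Hardcoded codes are easy: string literals live for `'static`.
pub const GBP: Currency = Currency("GBP");

// A code read at runtime arrives as an owned `String`. Turning it into a
// `&'static str` here means leaking the allocation, which is never freed.
pub fn from_runtime(code: String) -> Currency {
    Currency(Box::leak(code.into_boxed_str()))
}
```

Leaking a few bytes per distinct currency may be tolerable in some programs, but it is a trade-off that has to be made knowingly, and it does not generalize to stringy types with many distinct values.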
This won’t do. A good solution needs to be fixed size, clean to write, and also support arbitrary strings.
V2: Fixed size with const generics
A newish addition to the Rust language is const generics. This essentially lets us parametrize types over primitive values. That is, before you could have Vec<T> where T was some type, but now you can also have Array<N> where N is a number. Instead of basing our newtype on String, we’ll instead have it wrap a new type called FixedStr<N>:
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub struct FixedStr<const N: usize> {
buf: [u8; N],
}
impl<const N: usize> FixedStr<N> {
pub fn as_str(&self) -> &str {
// By construction, this should never fail
str::from_utf8(&self.buf[..]).expect("invalid utf8 in FixedStr")
}
}
impl<const N: usize> fmt::Display for FixedStr<N> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.as_str())
}
}
impl<const N: usize> str::FromStr for FixedStr<N> {
type Err = eyre::Error;
fn from_str(str: &str) -> Result<Self, Self::Err> {
// This fails if `str` isn't exactly `N` bytes long.
Ok(Self {
buf: str.as_bytes().try_into()?,
})
}
}
A FixedStr<N> is a struct that contains a single field which is an array of length N of u8 values. Our currency is going to be a wrapper around FixedStr<3>.
We have to use an array instead of a str because, unlike the latter, arrays have a size known at compile time and can be stack-allocated. But users of our type would rather deal with strs, so we handle conversions to and from in the as_str() function and the FromStr implementation. Note that the conversion is fallible in both directions; we could avoid some of this because we know that the contents of the array are always valid UTF-8 by construction, but that would require using unsafe, and that would muddy the presentation.
The most important thing we get from this custom type is the Copy trait. Rust knows that this type is safe to bitwise copy, so we don’t have to ever manually clone it.
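Here is a dependency-free sketch of what Copy buys us. It mirrors FixedStr<N> but swaps the eyre-based FromStr for a hypothetical new constructor returning Option, so that it stands alone; consume is likewise made up for the example.

```rust
use std::convert::TryInto;
use std::str;

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct FixedStr<const N: usize> {
    buf: [u8; N],
}

impl<const N: usize> FixedStr<N> {
    // Simplified constructor: `None` unless `s` is exactly `N` bytes long.
    pub fn new(s: &str) -> Option<Self> {
        let buf: [u8; N] = s.as_bytes().try_into().ok()?;
        Some(Self { buf })
    }

    pub fn as_str(&self) -> &str {
        // The bytes came from a `&str`, so this should never fail.
        str::from_utf8(&self.buf).expect("invalid utf8 in FixedStr")
    }
}

// Takes the value by move; the caller keeps its own copy because of `Copy`.
pub fn consume(s: FixedStr<3>) -> usize {
    s.as_str().len()
}
```

A value passed to consume is still usable afterwards with no clone() in sight, and the whole thing occupies exactly N bytes.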
With our FixedStr<N> type in hand, we define our newtype:
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub struct Currency(FixedStr<3>);
impl fmt::Display for Currency {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.0.as_str())
}
}
impl str::FromStr for Currency {
type Err = eyre::Error;
fn from_str(str: &str) -> Result<Self, Self::Err> {
// This fails if `str` isn't exactly 3 bytes long.
Ok(Self(
// `wrap_err` takes its message verbatim, so format it explicitly.
str.parse().wrap_err_with(|| format!("converting '{str}' into Currency"))?,
))
}
}
// More impls here: Serialize/Deserialize, ToSql/FromSql,
// TryFrom<String>, TryFrom<&String>, etc.
#[derive(Debug)]
pub struct Event {
pub account: String,
pub net_cash: f64,
pub gbp_cash: f64,
pub currency: Currency,
pub narrative: String,
}
pub fn handle_event(
acc: &str,
net_cash: f64,
cur: Currency,
narrative: &str,
) -> eyre::Result<Event> {
info!(
"Account {} {}ed with {} {}: {}",
acc,
if net_cash < 0.0 { "debit" } else { "credit" },
net_cash,
cur,
narrative
);
Ok(Event {
account: acc.into(),
net_cash,
gbp_cash: fx_convert(net_cash, cur, "GBP".parse()?),
currency: cur,
narrative: narrative.into(),
})
}
pub fn fx_convert(amount: f64, _cur1: Currency, _cur2: Currency) -> f64 {
amount * 1.2
}
Since the underlying type has Copy, so does our newtype. As a result, we no longer pass references to it around, and we no longer have clone() calls. There’s still the fairly ugly "GBP".parse()?, but we’ll fix that a bit later.
Also, since the underlying type has fallible conversions to and from &str, so does our type: we now have FromStr and TryFrom instead of just From.
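The TryFrom impls are elided in the listing above; one plausible shape is to delegate to FromStr, as in the sketch below. The BadLength error type and the bare [u8; 3] payload are simplifications made for this example and not the post’s actual definitions.

```rust
use std::convert::TryFrom;
use std::str::FromStr;

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct Currency([u8; 3]);

// Hypothetical error type carrying the offending length.
#[derive(Debug, Eq, PartialEq)]
pub struct BadLength(pub usize);

impl FromStr for Currency {
    type Err = BadLength;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let bytes = s.as_bytes();
        if bytes.len() != 3 {
            return Err(BadLength(bytes.len()));
        }
        Ok(Self([bytes[0], bytes[1], bytes[2]]))
    }
}

// Both `TryFrom` impls reuse the `FromStr` logic via `str::parse`.
impl TryFrom<&str> for Currency {
    type Error = BadLength;

    fn try_from(s: &str) -> Result<Self, Self::Error> {
        s.parse()
    }
}

impl TryFrom<String> for Currency {
    type Error = BadLength;

    fn try_from(s: String) -> Result<Self, Self::Error> {
        s.parse()
    }
}
```

Keeping the validation in one place means the conversions cannot drift apart.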
This is all pretty good, but there’s an obvious question we’ve been avoiding. What about the other stringy types? Specifically, what about accounts? Clearly those should have their own type, but FixedStr<N> won’t work. Account strings are usually assigned by banks, brokers, and clearing firms and they don’t all have the same length. That said, we can reasonably assume they have some max length, so let’s add support for that to our type.
V3: Max size with const generics
We replace FixedStr<N> with StackStr<N>. The big new thing is the len: u8 field. When we store a string shorter than N, we’ll put its length in len. I’m choosing to make the field a u8 because Rust is going to be copying values of this type around willy-nilly, so it really shouldn’t be too big. Despite this, the const generics parameter still has to be of type usize because the size of the array must always be of type usize. We’ll just have to do some runtime checks to enforce the maximum size.
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub struct StackStr<const N: usize> {
len: u8,
buf: [u8; N],
}
impl<const N: usize> StackStr<N> {
pub fn as_str(&self) -> &str {
// By construction, this should never fail
str::from_utf8(&self.buf[..self.len as usize]).expect("invalid utf8 in StackStr")
}
}
impl<const N: usize> fmt::Display for StackStr<N> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.as_str())
}
}
impl<const N: usize> str::FromStr for StackStr<N> {
type Err = eyre::Error;
fn from_str(str: &str) -> Result<Self, Self::Err> {
let bytes = str.as_bytes();
let len = bytes.len();
if len > u8::MAX as usize {
bail!("StackStr can be at most {} big", u8::MAX);
}
// A string of exactly `N` bytes still fits in the buffer.
if len > N {
bail!("String '{str}' does not fit in StackStr<{N}>");
}
let mut buf = [0; N];
buf.as_mut_slice()[0..len].copy_from_slice(bytes);
Ok(Self {
len: len as u8,
buf,
})
}
}
The small change is in as_str(). Instead of returning &self.buf[..] as UTF-8, we instead return &self.buf[..self.len as usize].
The big change is in FromStr. It’s fundamentally the same as before, but with added checks that the string’s length fits in a u8 (at most 255 bytes), and that the given string doesn’t exceed the maximum length N from the type.
The StackStr<N> struct needs no padding, since both of its fields have an alignment of 1, so it takes N+1 bytes of memory.
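The size claim is easy to check with std::mem. The sketch below redefines only the struct’s fields (the layout helper is made up for this example) and reports its size and alignment. Strictly speaking, Rust does not promise a particular layout for a struct without a repr attribute, so treat the numbers as what current compilers produce in practice.

```rust
use std::mem::{align_of, size_of};

// Same fields as the post's `StackStr<N>`; only the layout matters here.
#[allow(dead_code)]
#[derive(Clone, Copy)]
pub struct StackStr<const N: usize> {
    len: u8,
    buf: [u8; N],
}

// Returns (size, alignment) in bytes for a given capacity `N`.
pub fn layout<const N: usize>() -> (usize, usize) {
    (size_of::<StackStr<N>>(), align_of::<StackStr<N>>())
}
```

For comparison, a len: usize field would raise the alignment to 8 on a 64-bit target and make a 4-byte currency occupy 16 bytes, which is part of why the u8 length is attractive for a type that gets copied everywhere.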
With the underlying type done, we just need to define our Currency newtype. It’s just the same code as before. And in order to make an Account newtype, we just have to copy-paste that code again… let’s automate the boilerplate generation.
Generalizing with macro_rules!
A very simple way to generate boilerplate in Rust is with macro_rules!. For the most part, we just write the code we want the macro to generate with holes that are filled in by the macro arguments. It’s very much like cpp macros, except there’s no chance of syntax tokens getting combined in weird ways, and it’s easy to make multi-line macros.
macro_rules! newtype {
($name:ident, $size:literal) => {
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub struct $name(StackStr<$size>);
impl $name {
pub const fn new_unchecked(buf: [u8; $size], len: u8) -> Self {
Self(StackStr::new_unchecked(buf, len))
}
}
impl fmt::Display for $name {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.0.as_str())
}
}
impl str::FromStr for $name {
type Err = eyre::Error;
fn from_str(str: &str) -> Result<Self, Self::Err> {
// This fails if `str` is too big
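// String literals are not interpolated by `wrap_err`, and `$name` is not
// expanded inside them, so build the message with `format!` and `stringify!`.
Ok(Self(str.parse().wrap_err_with(|| {
format!("converting '{}' into {}", str, stringify!($name))
})?))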
}
}
};
}
newtype!(Currency, 4);
newtype!(Account, 8);
Once we have the macro, we define each newtype with a single line of code. While this isn’t the ideal way of creating newtypes (my kingdom for an OCaml functor), it’s good enough.
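To see the generated types in action without pulling in eyre, here is a self-contained sketch of the same idea. It uses String as a stand-in error type and drops new_unchecked; both are simplifications of this sketch, and show_currency is a hypothetical consumer.

```rust
use std::fmt;
use std::str::{self, FromStr};

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct StackStr<const N: usize> {
    len: u8,
    buf: [u8; N],
}

impl<const N: usize> StackStr<N> {
    pub fn as_str(&self) -> &str {
        str::from_utf8(&self.buf[..self.len as usize]).expect("invalid utf8 in StackStr")
    }
}

impl<const N: usize> FromStr for StackStr<N> {
    // Plain `String` errors stand in for the post's `eyre::Error`.
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let bytes = s.as_bytes();
        if bytes.len() > N || bytes.len() > u8::MAX as usize {
            return Err(format!("'{}' does not fit in StackStr<{}>", s, N));
        }
        let mut buf = [0; N];
        buf[..bytes.len()].copy_from_slice(bytes);
        Ok(Self { len: bytes.len() as u8, buf })
    }
}

macro_rules! newtype {
    ($name:ident, $size:literal) => {
        #[derive(Clone, Copy, Debug, Eq, PartialEq)]
        pub struct $name(StackStr<$size>);

        impl fmt::Display for $name {
            fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                f.write_str(self.0.as_str())
            }
        }

        impl FromStr for $name {
            type Err = String;

            fn from_str(s: &str) -> Result<Self, Self::Err> {
                s.parse::<StackStr<$size>>()
                    .map(Self)
                    .map_err(|e| format!("converting into {}: {}", stringify!($name), e))
            }
        }
    };
}

newtype!(Currency, 4);
newtype!(Account, 8);

// Only accepts a `Currency`; passing an `Account` is a compile-time error.
pub fn show_currency(cur: Currency) -> String {
    cur.to_string()
}
```

Currency and Account share all of their implementation, yet they are unrelated types as far as the compiler is concerned, so they cannot be swapped by accident.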
Constructing const values
The only remaining niggle is the ad-hoc Currency creations like "GBP".parse()?. We’d rather these be constants that we could just reference with names like GBP. Unfortunately, this is where our luck runs out. With stable Rust, the best I could come up with is this const constructor:
impl<const N: usize> StackStr<N> {
pub const fn new_unchecked(buf: [u8; N], len: u8) -> Self {
Self { len, buf }
}
}
const GBP: Currency = Currency::new_unchecked([b'G', b'B', b'P', 0], 3);
const EUR: Currency = Currency::new_unchecked([b'E', b'U', b'R', 0], 3);
This works, but is pretty ugly, and there’s nothing to stop us from constructing an invalid value by getting the len wrong, or by putting bytes that aren’t valid UTF-8 in the array.
At some point in the future, the following will be possible, but we need the const_mut_refs unstable feature to get it to compile today:
impl<const N: usize> StackStr<N> {
// Doesn't compile
pub const fn new_from_str(str: &'static str) -> Self {
let mut buf = [0; N];
unsafe {
std::ptr::copy_nonoverlapping(str.as_ptr(), &mut buf as *mut _, str.len());
}
Self {
len: str.len() as u8,
buf,
}
}
}
Something could also probably be implemented with procedural macros, but that seems like a disproportionate amount of work for what is ultimately going to be just a few lines requiring careful coding.
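As a possible middle ground, the sketch below copies the bytes with a while loop and plain indexing, and so avoids raw pointers and mutable references altogether. To the best of my knowledge this compiles on stable toolchains from Rust 1.57 onwards, where panicking in const contexts was stabilized, but treat that version as an assumption. The from_str_const name is made up for this example.

```rust
use std::str;

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct StackStr<const N: usize> {
    len: u8,
    buf: [u8; N],
}

impl<const N: usize> StackStr<N> {
    // Hypothetical constructor; not part of the post's code.
    pub const fn from_str_const(s: &str) -> Self {
        let bytes = s.as_bytes();
        // When evaluated for a `const` item, a failed assertion is a compile error.
        assert!(bytes.len() <= N);
        assert!(bytes.len() <= u8::MAX as usize);
        let mut buf = [0u8; N];
        let mut i = 0;
        // `for` loops are not available in `const fn`, so copy with `while`.
        while i < bytes.len() {
            buf[i] = bytes[i];
            i += 1;
        }
        Self { len: bytes.len() as u8, buf }
    }

    pub fn as_str(&self) -> &str {
        str::from_utf8(&self.buf[..self.len as usize]).expect("invalid utf8 in StackStr")
    }
}

pub const GBP: StackStr<4> = StackStr::from_str_const("GBP");
pub const USDT: StackStr<4> = StackStr::from_str_const("USDT");
```

Since the input is a &str, the bytes are valid UTF-8 by construction, and the length can no longer disagree with the contents. A newtype generated by the macro could expose the same thing by forwarding to this constructor.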
Conclusion
To sum up, we’ve seen how to create a type-safe and ergonomic Currency type through the judicious use of newtypes. The “application” code looks pretty good too, and there’s no opportunity to misuse the stringy types:
#[derive(Debug)]
pub struct Event {
pub account: Account,
pub net_cash: f64,
pub gbp_cash: f64,
pub currency: Currency,
pub narrative: String,
}
pub fn handle_event(
account: Account,
net_cash: f64,
currency: Currency,
narrative: &str,
) -> eyre::Result<Event> {
info!(
"Account {} {}ed with {} {}: {}",
account,
if net_cash < 0.0 { "debit" } else { "credit" },
net_cash,
currency,
narrative
);
Ok(Event {
account,
net_cash,
gbp_cash: fx_convert(net_cash, currency, GBP),
currency,
narrative: narrative.into(),
})
}
pub fn fx_convert(amount: f64, _cur1: Currency, _cur2: Currency) -> f64 {
amount * 1.2
}
It may seem like it took a lot of work, but most of it was in the old versions. The final version is only 42 lines for StackStr<N>, and 26 lines for the newtype! macro. With these ~70 lines in place, we can define more newtypes in just single lines of code, so there’s really no excuse not to promote all stringy types to actual types.
Thanks to Francesco Mazzoli for reading drafts of this post.