Be Careful Zero-Copying Strings with serde

When deserializing a string using serde, it is possible to use a borrowed &str instead of an owned String:

use serde::Deserialize;
use serde_json;

#[derive(Deserialize)]
struct Foo<'a> {
    // This string is borrowed.
    text: &'a str,
}

fn main() {
    let json = r#"{ "text": "Hello, world!" }"#;

    let foo: Foo = serde_json::from_str(json).unwrap();

    println!("{}", foo.text); // Hello, world!
}

The borrowed string is a reference to a portion of the original serialized data. In this case, foo.text refers to a slice of the json variable that contains the text Hello, world!.

This process is called zero-copy deserialization, and it can be more efficient than allocating a new String and copying the data into it. Be warned, however: some strings cannot be deserialized into &str and must be deserialized into a String instead.

I found this out while deserializing text that contained backslashes:

let json = r#"{ "text": "Go to C:\\Users\\bd\\Desktop" }"#;

let foo: Foo = serde_json::from_str(json).unwrap();

println!("{}", foo.text);

Instead of printing Go to C:\Users\bd\Desktop as I expected, it panicked!

thread 'main' panicked at src/main.rs:12:47:
called `Result::unwrap()` on an `Err` value: Error("invalid type: string "Go to C:\\Users\\bd\\Desktop", expected a borrowed string", line: 1, column: 34)

When deserializing the text, serde_json needs to convert the escaped Go to C:\\Users\\bd\\Desktop into Go to C:\Users\bd\Desktop. The unescaped text does not appear anywhere in the input as one contiguous slice, so the only way to produce it is to allocate a new string. serde_json can’t do that here, however, because we told it not to by using zero-copy deserialization!
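
To make that reasoning concrete, here is a minimal std-only sketch of the choice a JSON parser faces for every string. The unescape function is a hypothetical helper written for this post, not serde_json's actual code, and it only understands a handful of escapes:

```rust
use std::borrow::Cow;

/// Hypothetical helper (not part of serde_json): decodes the escapes in the
/// body of a JSON string literal. Only `\n`, `\t`, `\\` and `\"` are handled
/// properly; `\uXXXX` is out of scope for this sketch.
fn unescape(raw: &str) -> Cow<'_, str> {
    // No backslash: the decoded text is identical to the input, so a
    // borrowed slice of the input is enough.
    if !raw.contains('\\') {
        return Cow::Borrowed(raw);
    }

    // Otherwise the decoded text differs from the input, so it has to be
    // written into a newly allocated buffer.
    let mut decoded = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            decoded.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => decoded.push('\n'),
            Some('t') => decoded.push('\t'),
            // Any other escaped character is kept and the backslash is
            // dropped, which covers `\\` and `\"`.
            Some(other) => decoded.push(other),
            // A trailing lone backslash is kept as-is.
            None => decoded.push('\\'),
        }
    }
    Cow::Owned(decoded)
}

fn main() {
    // Nothing to decode, so the result borrows from the input.
    assert!(matches!(unescape("Hello, world!"), Cow::Borrowed(_)));

    // Two backslashes per separator in the input, one in the decoded text.
    let decoded = unescape(r"Go to C:\\Users\\bd\\Desktop");
    assert!(matches!(decoded, Cow::Owned(_)));
    println!("{decoded}"); // Go to C:\Users\bd\Desktop
}
```

A &str field only has room for the first branch. When the second branch is needed, there is no owner for the new buffer, so the deserializer has to report an error.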

In order to fix this, you need to replace the borrowed &str with an owned String. It can be slower than zero-copy deserialization, but it supports all possible data inputs:

use serde::Deserialize;
use serde_json;

#[derive(Deserialize)]
struct Foo {
    text: String,
}

fn main() {
    let json = r#"{ "text": "Go to C:\\Users\\bd\\Desktop" }"#;

    let foo: Foo = serde_json::from_str(json).unwrap();

    println!("{}", foo.text); // Go to C:\Users\bd\Desktop
}

This kind of issue will arise when deserializing other escape codes in JSON, such as \n and \t. It can also occur when using other types that can be zero-copied, such as &Path [1]. Next time you consider using zero-copy deserialization, be sure you’re OK with limiting what data you can support.

Addendum

As @korrat has helpfully pointed out, Cow<str> can be used as a compromise between &str and String. If you annotate the field with #[serde(borrow)], serde will first try to zero-copy deserialize the string, and will fall back to an owned copy if the data needs to be modified.

As a result, Cow<str> should be preferred as it offers performance improvements without restricting what data can be deserialized:

use serde::Deserialize;
use serde_json;

use std::borrow::Cow;

#[derive(Deserialize)]
struct Foo<'a> {
    // Try to borrow the string when possible, but clone it when necessary.
    #[serde(borrow)]
    text: Cow<'a, str>,
}

fn main() {
    // No changes need to be made, so the string can be borrowed.
    let json = r#"{ "text": "Hello, world!" }"#;

    let foo: Foo = serde_json::from_str(json).unwrap();

    assert!(matches!(foo.text, Cow::Borrowed(_)));

    // Changes need to be made, so the string must be owned.
    let json = r#"{ "text": "Hello,\nworld!" }"#;

    let foo: Foo = serde_json::from_str(json).unwrap();

    assert!(matches!(foo.text, Cow::Owned(_)));
}

In the initial release of this post, I incorrectly wrote that Cow<str> does not support zero-copy deserialization. This isn’t the case: you just need to annotate the Cow<str> field with #[serde(borrow)]. Don’t forget that attribute, or Cow<str> will just be equivalent to String! See the serde docs for more information and examples.


  1. Be especially careful about using &Path. Since it cannot be deserialized from JSON text containing escaped backslashes, you’re essentially eliminating support for Windows paths.