Building a Rust Dictionary Program

Why Rust?

I've casually had my eye on Rust for quite a bit now. A couple of years ago I joined a hackathon leveraging some Web3 technologies at the height of the crypto hype, and learned about smart contracts. At the time, the chain we were building on offered two options for writing these contracts: AssemblyScript or Rust, and this was the first time I had heard of the language. It turns out these languages were chosen for easy compilation to WebAssembly, but I didn't know this at the time. I was two years into my software engineering degree and I had no idea what WebAssembly even was. All I knew was that AssemblyScript looked like JavaScript (which I already knew) and my hackathon teammate had already used it to develop some contracts. So, that's what we went with, and Rust was placed on the back burner of my mind.

Moving forward, that's how I primarily thought of Rust. I had never heard of it used in another context, so I assumed it was just a smart contract development language like Solidity. That is, until the last year or so. I started seeing the name pop up more and more around the internet, in Discord channels, on Reddit, etc. Yet, I couldn't place it in a neat box along with the other languages I've heard mention of over the years. One person would talk about their web application, and another would bring up systems design. I know there are lots of languages used in a wide range of very distinct domains, but I'd usually hear tell of the root: web for JS, data for Python, masochism for C. I had yet to understand what Rust's purpose was, and so I was intrigued.

As I started down the rabbit hole, I heard about Rust's memory safety and promises of high-level features with low-level control. As someone who has felt somewhat stretched thin trying to develop my web development skills alongside exposing myself to more low-level development, I started to wonder if this could be an answer to some of my problems.

As such, I figured it was time to build something, and I decided to make a simple dictionary application. This post is mostly me during that process of exploration, figuring out syntax, the Rust way of doing things, and attempting to explain some of the concepts that trip me up. So, let's get started!

Setting Up My Development Environment

I'm on Linux, so to get started I downloaded the .deb file from the Rust website and installed it on my local machine with the default settings, appreciating that this automatically installed cargo, Rust's package manager, alongside it. After verifying that everything was up to date, I went on to make my first program by just creating a file in my working directory and calling it hello_world.rs.

fn main(){
    //println macro -> source code modified at compile time here
    println!("Hello, World!")
}

Then I compiled it with the rustc hello_world.rs command and ran the resulting binary with ./hello_world. I had officially written my first Rust program, utilizing my first macro at that. Now that I knew everything was installed correctly, I began to think about the dictionary application I wanted to make, and how I would organize my project.

Convention

My first task was to establish some consistency for my Rust projects. How should I structure my project? What naming conventions should be used? It may seem trivial, and some developers I know hate worrying about this stuff, but I love it, and I know it will boost my ability to effectively collaborate with other developers (and avoid their pedantic comments) in the future. It also happens to be one of those things that is easier the earlier you start to care.

The full list is here but the naming conventions I mostly cared about for starting are as follows:


    - types: UpperCamelCase
    - modules: snake_case
    - functions/methods: snake_case
    - macros: snake_case!
    - local variables: snake_case

Unlike with the hello world program, to set up my project directory for the dictionary app, I used cargo by running the command cargo new dictionary_app. This resulted in the following project structure:

dictionary_app
    - src
        - main.rs
    - .gitignore
    - Cargo.toml

I noticed the .gitignore immediately and was very pleased to learn that cargo initializes a git repository for you on creation.

As code gets decoupled and more files are created in the src directory, they can become Rust modules (once each one is declared with the mod keyword) that can be imported from my other files. Now that everything was set up appropriately I could finally start thinking about tackling the problem at hand.
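To make the module idea concrete, here is a minimal sketch. It uses an inline mod block so that it fits in one file; in a real project the body of the block would live in its own file (say src/lookup.rs) and main.rs would just contain mod lookup;. The module name lookup and the function normalize are hypothetical, they are not part of the dictionary app.

```rust
// Inline module standing in for a hypothetical src/lookup.rs file.
mod lookup {
    /// Trims surrounding white space and lowercases a word.
    /// `pub` makes the function visible outside the module.
    pub fn normalize(word: &str) -> String {
        word.trim().to_lowercase()
    }
}

// Bring the function into scope so it can be called without the module prefix.
use lookup::normalize;

fn main() {
    println!("{}", normalize("  Rust \n"));
}
```

Anything not marked pub stays private to the module, which is how Rust keeps the decoupled pieces from reaching into each other.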

Building the Thing

Okay, how am I actually going to create a dictionary application? As simply as possible. To start, the program will provide three main functions:

  1. request a word from the user

  2. call a dictionary API with that word

  3. print the result

Task 1: Getting user input

Here is the code I came up with to get user input, with comments to denote the lines I will discuss afterwards as I understand them.

//line 1
use std::io;


fn main() {

    //Line 2
    let mut word = String::new();
    println!("Enter word: ");

    //line 3
    let input = io::stdin();

    //line 4
    input.read_line(&mut word).expect("failed to read input");
    println!("Search for: {word}");

}

Line 1

The use keyword brings the io module (part of the standard library, std) into scope, which lets me write io::stdin() rather than the whole path. If it weren't there I could still access the function I need, but I would need to type out the whole path.

Line 2

Here I am creating a new string. You'll notice the :: syntax again. This denotes that we are calling a function on a type. In this case on String. To create a new one, I call the function new(). I also needed to use the mut keyword to make the variable word mutable (changeable), as all variables in Rust are immutable by default.

Line 3

Line 3 just creates a variable called input and stores an instance of the Stdin type, which is created in a similar way to the new String, except here we are calling the stdin() function from the io module rather than new() on String. As mentioned earlier, I could have omitted the use statement at the top of the file and instead written let input = std::io::stdin() for the same result.

This line also could have been merged with line 4, but I left them separate to solidify for myself that this function returns an instance of the Stdin type that then has a function called on it to produce the Result instance.

Line 4

Now in line 4, we call the function read_line() on our Stdin instance, and you can notice that we use a . rather than :: to call this function for that reason. :: for types and modules, . for instances of types.

When I first looked at this code, I noticed the funny syntax in the function call as well. What the use of & means here is that we are sending the function a reference to our word variable, not the variable itself. This lets read_line() borrow the String and write into it in place, while main keeps ownership and can still use word afterwards. Like regular variables, references are immutable by default, and since I want the function to change the String, I must specify mut on the reference too. At least for this program, it is that simple.
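Here is a small sketch of the two kinds of reference, as I understand them. The helper functions append_suffix and byte_len are made up for illustration: one takes a mutable reference because it changes the String, the other takes a plain reference because it only reads it.

```rust
/// Appends text to the caller's String through a mutable reference.
fn append_suffix(target: &mut String, suffix: &str) {
    target.push_str(suffix);
}

/// Only reads the String, so an immutable reference is enough.
fn byte_len(text: &String) -> usize {
    text.len()
}

fn main() {
    // The variable itself must be `mut` before a `&mut` reference can be taken.
    let mut word = String::from("rust");

    append_suffix(&mut word, "y");

    // main still owns `word`, so it can keep using it after the call.
    println!("{} is {} bytes long", word, byte_len(&word));
}
```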

The read_line() function returns an enum called Result with two variants, Ok and Err. Instances of the Result type provide the expect() function. When called on an Err, this function will crash (panic) the program and print the message I provided along with the error information; on an Ok, it just hands back the value inside.
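For comparison, expect() is not the only way to deal with a Result. The sketch below handles both variants explicitly with match instead of crashing. To keep it runnable without a keyboard, the hypothetical read_word helper accepts any buffered reader, and a byte slice stands in for stdin; I believe passing a locked stdin handle would work the same way.

```rust
use std::io::BufRead;

/// Reads one line and returns it trimmed, or a description of the I/O error.
/// Hypothetical helper, generic so it can be fed stdin or canned input.
fn read_word<R: BufRead>(mut reader: R) -> Result<String, String> {
    let mut word = String::new();
    match reader.read_line(&mut word) {
        // Ok carries the number of bytes read, which is not needed here.
        Ok(_bytes_read) => Ok(word.trim().to_string()),
        // Err carries the error; turn it into a message instead of panicking.
        Err(error) => Err(format!("failed to read input: {error}")),
    }
}

fn main() {
    // A byte slice stands in for what the user would have typed.
    let fake_input = &b"rust\n"[..];

    match read_word(fake_input) {
        Ok(word) => println!("Search for: {word}"),
        Err(message) => eprintln!("{message}"),
    }
}
```

For a small program like this one, expect() is less typing, but match lets the program decide what to do when something goes wrong.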

Now that I was able to get the word I wanted to search, it was time to tackle the API call.

Task 2: Calling the API

To call an API I needed to choose one, and what better dictionary API to use than the first one that came up when I googled "Free dictionary API"? So that's what I used. The endpoint they provide is https://api.dictionaryapi.dev/api/v2/entries/en/<word>. Now I just needed to figure out how to call it.

After searching the docs for two minutes I learned that I can make a GET request with the reqwest crate. This brought me on a side quest to determine what a crate was, but from what I can tell, it is essentially a group of modules. I will be thinking of them as being similar to packages or libraries in other languages until I am corrected.

I ran cargo add reqwest to add the dependency to my Cargo.toml file, and at this point I came across a small hurdle. The reqwest crate depends on openssl-sys on my operating system. Because openssl-sys wraps OpenSSL, which is a C library and not Rust, cargo didn't handle this one for me automatically, and I needed to add openssl to my Cargo.toml file manually, using the vendored option. But after sorting this out it built with no problem. I could also have just installed OpenSSL on my machine, and this almost certainly won't happen to you, but it caused me enough grief to mention. One more thing worth knowing: the reqwest::blocking API I use below is optional and has to be switched on through reqwest's blocking feature.

With everything configured, I wrote the data-fetching code as follows:

//line 1
let trimmed_word = word.trim();


//line 2, 3
let endpoint = "https://api.dictionaryapi.dev/api/v2/entries/en/";
let full_path = endpoint.to_owned() + trimmed_word;

//line 4
let mut definition_result = reqwest::blocking::get(&full_path)
    .expect("Unable to open"); 


//line 5
let mut result = String::new(); 

//line 6
definition_result.read_to_string(&mut result)
    .expect("failed to read to string");

println!("Response JSON: {result}");

Lines 1-3

Here I create a few new variables. When the program gets input from the user, there may be white space surrounding it (read_line() keeps the trailing newline, for one), but when we pass this value to the endpoint it can't have any. Luckily, strings in Rust have a built-in trim() function that removes leading and trailing white space. Surprisingly enough, however, this function doesn't return a String, it returns a string slice (&str). This tripped me up a bit, so I will attempt to discuss the differences.

&str, and String

There are two string types often used in Rust: &str and String. Under the hood, both involve a pointer to some bytes. A String owns a buffer of bytes on the heap that has been reserved with a certain capacity. When a String grows past that capacity, the buffer is reallocated, so a String can change size at runtime. A &str is a pointer (plus a length) to a str, which is a string slice, and the data it points to can be on the heap, the stack, or in the binary of the program itself. Rust requires all variables to be of known size at compile time, but a str isn't, which is why we must use a pointer of known size (&str) to reference it. Unlike a String, a &str only borrows its data: it can't grow or reallocate it, and the underlying data can't be modified while the &str is in use. Get that?

What this means for our line of code is that endpoint can't be appended to, since it is a reference to a string slice (&str), and must first be converted to a String with the to_owned() function, which copies the text into a new heap allocation, before the concatenation can occur.

You might ask, as I did, why the trim function returns a &str in the first place. The reason is that a &str can point to a piece of an existing String. For example, if we have a String on the heap, &str can point to the second character and finish at the second last character. This would let the program have several effective strings without allocating additional memory to hold their contents.

If I consider the example with white space and have a String let word = String::from(" word "), then create a &str that points to the second character with a length of 4, we get a new variable that will be read as word without allocating more memory.

The takeaway for me was that we have one type that saves space but is read-only or another that may cause allocation/reallocation but is mutable.
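To convince myself of this, I put together the sketch below. The helpers build_path and slice_offset are hypothetical names of my own. The first repeats the to_owned() concatenation from the program, and the second compares raw pointers to show that the trimmed slice starts inside the original String's buffer rather than in a fresh allocation. The format! macro is included as an alternative way to build the same String.

```rust
/// Trims the raw input and appends it to the endpoint.
fn build_path(endpoint: &str, raw_word: &str) -> String {
    let trimmed: &str = raw_word.trim();
    // to_owned() copies the endpoint into a new String that `+` can append to.
    endpoint.to_owned() + trimmed
}

/// How many bytes into `original` the slice `part` begins.
/// Only meaningful when `part` really is a slice of `original`.
fn slice_offset(original: &str, part: &str) -> usize {
    part.as_ptr() as usize - original.as_ptr() as usize
}

fn main() {
    let word = String::from(" word ");

    // No allocation here: `trimmed` borrows the middle of `word`.
    let trimmed: &str = word.trim();
    println!("trimmed: {trimmed:?}, length {}", trimmed.len());
    println!("starts {} byte(s) into word", slice_offset(&word, trimmed));

    let endpoint = "https://api.dictionaryapi.dev/api/v2/entries/en/";
    let full_path = build_path(endpoint, &word);

    // format! builds the same String in one step.
    let same_path = format!("{endpoint}{trimmed}");
    assert_eq!(full_path, same_path);
    println!("{full_path}");
}
```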

Lines 4-6

The fourth line just calls the endpoint using reqwest::blocking::get, passing a reference to our trimmed, concatenated endpoint String. Since reqwest::blocking::get returns a Result enum, we include the expect function to call if the Result variant is Err. The Ok variant of this result is of type Response, which implements a function called read_to_string() that accepts a mutable reference to a String (the empty one created on line 5) and writes the Response body to it. This function in turn returns a Result enum, hence the expect function called on it to handle error cases.

So that's it, right? Well, I thought so, but no. There was a secret line of code missing this entire time: line 0. At the top of the file, we need a use statement.

use std::io::Read;

Line 0 and Traits

It baffled me why this use statement was required. I'm not calling a function on a Read type anywhere, after all. But then, I learned about traits. This is a feature of Rust that is reminiscent of an interface in other languages, like Java. A trait declares function signatures, and it can also provide default implementations of those functions (much like default methods on a Java interface), so it can be used as a tool for code reuse.

If you look in the docs, you'll notice that reqwest::blocking::Response doesn't provide its own read_to_string() function. Instead, it implements the Read trait from std::io, where the read_to_string() function is defined. In Rust, you must have the trait in scope to call a method it provides, hence the required use statement. Luckily the cargo error messages are always extremely clear, which helped me to diagnose this quickly. After adding the use statement, the code built, and ran without issue, allowing me to search for a word and view the request result in JSON format.
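Here is a small sketch of both ideas, using only the standard library. The Describe trait and Word struct are hypothetical, made up to show a required method next to a default one. The read_bytes helper calls read_to_string() on a byte slice, which implements Read just like the reqwest Response does; as far as I can tell, removing the use line at the top makes that call stop compiling.

```rust
use std::io::Read; // needed so that read_to_string() can be called below

/// Hypothetical trait: one required method and one with a default body.
trait Describe {
    fn name(&self) -> String;

    // Implementors get this for free unless they override it.
    fn describe(&self) -> String {
        format!("This is {}", self.name())
    }
}

struct Word {
    text: String,
}

impl Describe for Word {
    fn name(&self) -> String {
        self.text.clone()
    }
}

/// Drains a byte slice into a String through the Read trait.
fn read_bytes(mut source: &[u8]) -> String {
    let mut result = String::new();
    source
        .read_to_string(&mut result)
        .expect("failed to read to string");
    result
}

fn main() {
    let word = Word { text: String::from("rust") };
    println!("{}", word.describe());
    println!("{}", read_bytes(b"hello"));
}
```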

Clean Up

After successfully fetching the data, I needed to clean it up so the information is more easily read by a user. The code I used to do this is as follows.

//line 1 (needs `use serde_json::Value;` at the top of the file)
let definition_information: Value = serde_json::from_str(&result)
    .expect("failed to convert from json");

let mut num_of_definitions: i32 = 1;
//line 2
for definition in definition_information[0]["meanings"][0]["definitions"]
    .as_array().expect("No definitions found"){

    println!("\nDefinition {}: \n", num_of_definitions);
    println!(" {}", definition["definition"]);
    num_of_definitions = num_of_definitions + 1;

}

Line 1

To print the data, we need to parse the data. The dictionary API returns its data in the form of JSON, and to parse this in Rust, we can use the serde_json crate. Serde will convert the JSON text into a serde_json::Value, a type that can represent any piece of JSON. We can then access the data as we need using square brackets.

Line 2

This line looks more intimidating than needed due to the extremely nested structure of the response I got from the API. Really, what's happening in definition_information[0]["meanings"][0]["definitions"] is that I reach the actual array of definitions by stepping through the rest of the JSON. It was nested in an array, in an object, in another array within the JSON for some reason.

After that, I just called as_array() on the value. This is because it is still of type serde_json::Value, and to loop through it as I did, I needed a Vec or a reference to a Vec, which is what I can get with serde_json::Value::as_array() (almost).

In truth, as_array() returns an Option, which is very similar to the Result type I discovered earlier, except it doesn't provide an Err and instead is just None if there isn't a value. Luckily, both of these can be handled the same way with expect(), as can be seen in my code.
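As with Result, an Option can also be handled without expect(). The sketch below avoids serde_json so that it runs with no dependencies: a nested Vec stands in for the parsed JSON, and the hypothetical first_definitions helper returns an Option in the same spirit as as_array(). It also uses enumerate() as an alternative to keeping a manual counter.

```rust
/// Returns the definitions of the first meaning, if there is one.
/// The nested Vec is a stand-in for the parsed JSON, not the real API shape.
fn first_definitions(meanings: &[Vec<String>]) -> Option<&Vec<String>> {
    meanings.first()
}

fn main() {
    let meanings = vec![vec![
        String::from("A reddish-brown color."),
        String::from("A disease of plants caused by a reddish-brown fungus."),
    ]];

    match first_definitions(&meanings) {
        // Some wraps the value when there is one.
        Some(definitions) => {
            // enumerate() yields (index, item) pairs, starting at 0.
            for (index, definition) in definitions.iter().enumerate() {
                println!("\nDefinition {}: \n", index + 1);
                println!(" {}", definition);
            }
        }
        // None means there was nothing to find; no crash needed.
        None => println!("No definitions found"),
    }
}
```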

Finishing Up

This all leaves us with a program that returns the following output.

Enter word: 
rust

Definition 1: 

 "The deteriorated state of iron or steel as a result of moisture and 
   oxidation."

Definition 2: 

 "A similar substance based on another metal (usually with qualification,
 such as \"copper rust\")."

Definition 3: 

 "A reddish-brown color."

Definition 4: 

 "A disease of plants caused by a reddish-brown fungus."

Definition 5: 

 "Damage caused to stamps and album pages by a fungal infection."

Which is good enough for me, for now. All in all, I enjoyed the experience of building this program and learning some basic Rust. The features that were conceptually more difficult tended to make a case for themselves implicitly as I learned more about them, and I felt more or less babied by the cargo error messages as they carried me along, making the investigative process extremely straightforward. I look forward to building some more complicated Rust projects and leveraging more of the language's features.

You can find the completed code for this program on my GitHub here.