A Comprehensive Guide To Redis For Data Scientists

An online database is a database that can be accessed over a local network or the internet. Instead of storing data on a desktop machine and its attached storage, online databases are hosted in the cloud and are often offered as Software as a Service (SaaS) through a web browser. These web-based services can be free or may require payment, usually as monthly or yearly subscriptions. You pay for what you use, so the amount of server space can be adjusted to your needs. The hassle of installing and maintaining the software yourself is also removed, as everything is managed in the cloud. Because the data lives in the cloud, you are not tied to a single computer: as long as access is granted, you can reach the data from almost any compatible device at any time. Such services often come with round-the-clock technical support as well. Some examples of such databases are Oracle Database, IBM Db2 and the well-known Amazon DynamoDB.

What is Redis?

Redis, short for Remote Dictionary Server, is a fast, open-source, in-memory key-value data store designed to be used as a database, cache, message broker, and queue. All Redis data resides in memory, in contrast to databases that store data on disks or SSDs. By eliminating the need to access disks, in-memory data stores such as Redis avoid seek-time delays and can access data in microseconds. Redis supports simple string keys mapped to string values, much like the data model of other key-value stores, and it also supports more complex data structures such as lists and sets. Some of the top features of Redis are versatile data structures, high availability, geospatial indexing, Lua scripting, transactions, on-disk persistence, and cluster support, which make it simpler to build real-time, internet-scale apps. Redis is a type of database commonly referred to as NoSQL or non-relational. In Redis, there are no tables, and there is no database-defined way of relating one piece of data in Redis to another. Using Redis instead of a relational or other primarily on-disk database, you can avoid writing unnecessary temporary data to disk and avoid having to scan for and delete that temporary data later, which ultimately improves performance.

How Does Redis Work?

Redis works by mapping keys to values that follow one of a set of predefined data types. It can also be scaled through a method known as sharding, in which you partition your data into different pieces. The data is partitioned based on IDs embedded in the keys, on a hash of the keys, or on some combination of the two. By partitioning your data, you can store and fetch it from multiple machines, which allows performance to scale linearly for certain problem domains.
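To make the two partitioning strategies concrete, here is a minimal Python sketch. It is only an illustration of the idea, not how Redis or any particular client implements it: the shard count, the key names and the use of zlib.crc32 as the hash are all assumptions made for this example.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical number of Redis instances


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from a stable hash of the whole key."""
    # zlib.crc32 is deterministic across runs (unlike the built-in hash()
    # for strings); it stands in for whatever hash a real setup would use.
    return zlib.crc32(key.encode("utf-8")) % num_shards


def shard_for_id(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from a numeric ID embedded in the key, e.g. 'user:1042'."""
    embedded_id = int(key.rsplit(":", 1)[1])
    return embedded_id % num_shards


# Group some example keys by the shard that would hold them.
placement = defaultdict(list)
for key in ["user:1", "user:2", "user:3", "user:1042"]:
    placement[shard_for(key)].append(key)

print(dict(placement))
print(shard_for_id("user:1042"))  # 1042 % 4 == 2
```

Because the same key always maps to the same shard, a client can compute where to read a value without asking any central directory.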

Even though Redis performs very well under most circumstances due to its in-memory design, there are situations where you might need Redis to process more read queries than a single Redis server can handle. To support higher read rates and to cope with failure of the server that Redis is running on, Redis supports master/slave replication (newer Redis versions call the slaves replicas). The slaves connect to the master and receive an initial copy of the full database. As writes are performed on the master, they are sent to all connected slaves, which are updated in near real time. With continuously updated slaves, clients can connect to any slave to read data, which spreads the read load and keeps the data readable if the master crashes or becomes unreachable. Redis can also be combined with another database to reduce the load on it.
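The replication flow described above can be pictured with a toy Python model. This is a simplified sketch and not Redis code: the Master and Replica classes are hypothetical, and writes are propagated synchronously inside one process, whereas real Redis replication is asynchronous and happens over the network.

```python
class Replica:
    """A read-only copy of the master's data."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)


class Master:
    """Accepts writes and forwards each one to every connected replica."""

    def __init__(self):
        self.data = {}
        self.replicas = []

    def connect(self, replica):
        # A newly connected replica first receives a full copy of the data.
        replica.data = dict(self.data)
        self.replicas.append(replica)

    def set(self, key, value):
        self.data[key] = value
        # Propagate the write so the replicas stay up to date.
        for replica in self.replicas:
            replica.data[key] = value


master = Master()
master.set("city", "Pune")

replica = Replica()
master.connect(replica)          # initial full copy
master.set("country", "India")   # propagated write

print(replica.get("city"), replica.get("country"))
```

Reads served from the replica return both the value copied at connection time and the value written afterwards.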

Supported Data Types

Some of the data types supported in Redis are:

  • Strings: operate on the whole string or parts of it, and increment or decrement integers and floats
  • Lists: push or pop items from both ends, read individual or multiple items, and find or remove items by value
  • Sets & Sorted Sets: add, fetch or remove individual items, fetch random items, check membership, and compute intersections, unions and differences
  • Hashes: add, fetch or remove individual fields, or fetch the whole hash
  • Bitmaps
  • HyperLogLogs
  • Geospatial Indexes
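If you come from Python, a rough way to picture the first few types is through the built-in structures you already know. The snippet below is only an analogy that runs without Redis; the variable names and sample values are made up, and the Redis commands named in the comments are the nearest equivalents rather than exact matches.

```python
from collections import deque

# String: a single value that can also be treated as a counter.
page_views = "10"
page_views = str(int(page_views) + 1)      # similar in spirit to INCR

# List: push and pop from both ends.
queue = deque()
queue.append("job-1")                      # like RPUSH
queue.appendleft("job-0")                  # like LPUSH
first_job = queue.popleft()                # like LPOP

# Set: membership checks, unions, intersections and differences.
python_users = {"asha", "ben", "chen"}
r_users = {"ben", "dita"}
both = python_users & r_users              # like SINTER
is_member = "asha" in python_users         # like SISMEMBER

# Sorted set: members ordered by a numeric score.
scores = {"asha": 82.5, "ben": 91.0, "chen": 77.0}
leaderboard = sorted(scores, key=scores.get, reverse=True)  # like ZREVRANGE

# Hash: a small field -> value mapping stored under one key.
user = {"name": "asha", "language": "python"}
user["language"] = "r"                     # like HSET

print(page_views, first_job, both, is_member, leaderboard, user)
```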

What Languages does Redis Support?

Redis simplifies your code by letting you write fewer lines to store, access and use data in your applications. Its multiple data structures can be used to store data with just a few lines of code, and each comes with commands to manipulate and interact with the data. More than a hundred open-source clients are available for developers using Redis.

Languages supported include:

  • Java
  • Python
  • PHP
  • C, C++ and C#
  • JavaScript
  • Node.js
  • Ruby
  • R
  • Go and many more.

High Availability and Scalability of Redis

Redis offers a primary-replica architecture in either a single-node primary or a clustered topology. This allows you to build highly available solutions with consistent performance and reliability. In addition, there are several options for adjusting the cluster size: you can scale up, scale in, or scale out. This gives you the flexibility to grow the cluster according to your demands.

Caching using Redis

Redis is a strong option for implementing a highly available, in-memory cache that decreases data access latency and increases throughput, because it can serve frequently requested items in a millisecond or less. Caching is the process of storing copies of data in a temporary storage location so that they can be accessed more quickly. Technically, a cache is any temporary storage location for copies of files or data, but the term is often used in reference to Internet technologies. Popular examples of caching with Redis include database query result caching, persistent session caching, web page caching, and caching of frequently requested objects such as images, files and metadata. With the ability to specify how long you want to keep data and which data to evict first, Redis enables a range of intelligent caching patterns.

Data Expiration and Eviction using Redis

Keys in Redis can be tagged with a Time to Live (TTL), set in seconds, after which they are removed from the database. For situations where memory runs low, a series of configurable eviction policies is available to choose from. Some policies consider keys that have a Time to Live before data that has none, which lets you create a tiered hierarchy of memory objects. Among the candidates, it usually makes the most sense to evict the least recently used or least frequently used keys.
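The interplay of expiry and eviction can be sketched with a small Python class. This is a toy model for intuition only: TTLCache is a hypothetical name, the clock is injected so the example does not need to wait, and Redis's own implementation differs (for instance, its LRU is approximate).

```python
import time
from collections import OrderedDict


class TTLCache:
    """Toy cache with per-key expiry and least-recently-used eviction."""

    def __init__(self, max_items, clock=time.monotonic):
        self.max_items = max_items
        self.clock = clock                  # injectable so expiry can be tested
        self.items = OrderedDict()          # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = self.clock() + ttl if ttl is not None else None
        self.items[key] = (value, expires_at)
        self.items.move_to_end(key)         # newest entry is most recently used
        while len(self.items) > self.max_items:
            self.items.popitem(last=False)  # evict the least recently used key

    def get(self, key):
        if key not in self.items:
            return None
        value, expires_at = self.items[key]
        if expires_at is not None and self.clock() >= expires_at:
            del self.items[key]             # expired: remove lazily on access
            return None
        self.items.move_to_end(key)         # mark as recently used
        return value


now = [0.0]                                 # fake clock, in seconds
cache = TTLCache(max_items=2, clock=lambda: now[0])

cache.set("session", "abc", ttl=40)
cache.set("a", 1)
cache.get("session")                        # touch "session" so "a" is oldest
cache.set("b", 2)                           # cache is full, so "a" is evicted
now[0] = 41.0                               # move past the 40-second TTL

print(cache.get("a"), cache.get("b"), cache.get("session"))
```

Here "a" disappears because it was the least recently used key when space ran out, while "session" disappears because its Time to Live elapsed.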

Geospatial Features of Redis

Redis comes with geospatial index data structures and commands. These built-in, in-memory data structures and operators help manage real-time geospatial data at scale and speed. For example, once latitude and longitude coordinates are stored, users can calculate the distance between objects or query for objects within a given radius of a point. These commands can return distances in multiple units, such as feet, kilometres, etc.

The speed of Redis allows these data points to be updated quickly. Used properly, it can therefore power ridesharing applications that connect riders with nearby drivers and provide real-time updates as the drivers travel.
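To show what a distance calculation and a radius query involve, here is a plain Python sketch using the haversine formula. It is an illustration of the idea rather than Redis's implementation: the driver names and coordinates are made up, and the Earth radius constant is an assumption of this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(points, lat, lon, radius_km):
    """Return the names of points that lie within radius_km of (lat, lon)."""
    return [
        name
        for name, (p_lat, p_lon) in points.items()
        if distance_km(lat, lon, p_lat, p_lon) <= radius_km
    ]


# Hypothetical driver positions as (latitude, longitude).
drivers = {
    "driver-1": (12.9716, 77.5946),
    "driver-2": (12.9800, 77.6000),
    "driver-3": (13.0827, 80.2707),
}

print(within_radius(drivers, 12.9716, 77.5946, radius_km=5))
```

A rider at the first driver's position would be matched with the two nearby drivers, while the third, hundreds of kilometres away, is filtered out.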

Some Use Cases of Redis

Modern data-driven applications need machine learning to process massive amounts of rapidly moving data quickly, and Redis is well suited to such high-velocity processing. Its speed helps when processing data and when building, training and deploying machine learning models, which makes Redis a good choice for such use cases. Redis can also be used with streaming solutions such as Apache Kafka and Amazon Kinesis to ingest and process real-time data with low latency. It can also be used for social media analytics, ad targeting, personalization and IoT.

Using Redis

For demonstration purposes, I ran the following commands on the Redis website through its tutorial UI. You can run the same commands on your own system by installing redis-cli, the Redis command-line interface.

Setting a key

As an example, we’ll first set a key named “victor” with the value “HELLO WORLD”:

> SET victor "HELLO WORLD"

If the key is set, we get the following output:

OK

Getting the Key

To retrieve the value stored under the key, we use the following command.

> GET victor

"HELLO WORLD"

Deleting the Key

To delete our created key, we can use the following command.

> DEL victor

(integer) 1 #key got deleted

Verifying the result

> GET victor

(nil)

Hence we can confirm that our key is now completely removed.

Setting a Key with Time to Live

We can also set keys with an expiry time. The time is given in seconds, and the key is removed from the server once that time has passed.

> SETEX victor 40 "I said, Hello World!"

OK #key set with a 40-second time limit

Checking Time to Live

You can also check how much time remains before the key expires.

> TTL victor

(integer) 36

Renaming our Key

Keys can be renamed using the following command.

> RENAME victor bar  #renaming victor as bar

OK

Flushing All Keys

The following command removes every key saved so far.

> FLUSHALL

OK #just got flushed

EndNotes

In this article, we looked at what Redis is and what it is capable of. We also explored its use cases, ran basic Redis commands, and checked how they behave. Redis offers highly performant and efficient reads and writes thanks to its optimizations. I would therefore recommend exploring Redis further and trying it in your own projects.

Happy Learning!

The post A Comprehensive Guide To Redis For Data Scientists appeared first on Analytics India Magazine.