Detailed description of the definition, function and usage of "NoSQL"

This is a minimalistic demo designed to introduce the concept of NoSQL.

The term NoSQL has become increasingly popular in recent years. But what exactly does "NoSQL" mean? How and why is it useful? In this article, we'll build a NoSQL database in pure Python (which I like to call "lightly structured pseudocode") and answer these questions along the way.

OldSQL

In many cases, SQL has become synonymous with "database." However, SQL stands for Structured Query Language and is not a database technology itself. Rather, it is the language used to retrieve data from an RDBMS (Relational Database Management System). MySQL, MS SQL Server, and Oracle are all examples of RDBMSs.

The "R" in RDBMS stands for "Relational," which is the most significant part of the system. Data is organized into tables, each consisting of columns with specific types. The schema defines the structure of the database, describing each table's columns and their types. For example, a table called Car might have the following columns:

Make: a string

Model: a string

Year: a four-digit number; alternatively, a date

Color: a string

VIN (Vehicle Identification Number): a string

Each entry in a table is called a row or record. To uniquely identify each record, a primary key is usually defined. In the Car table, VIN is a natural choice as the primary key because it guarantees that each car has a unique identifier. Two different rows may have the same values in the Make, Model, Year, and Color columns, but for different cars, there will definitely be different VINs. Conversely, if two rows have the same VIN, we don't need to check other columns to know they refer to the same car.

Querying

SQL allows us to extract useful information by querying the database. A query is a structured request to an RDBMS to return specific rows as the answer to a question. Suppose the database represents all registered vehicles in the US. To get the Make and Model of every record, you can execute the following SQL query on the database:

SELECT Make, Model FROM Car;

Translating the SQL roughly into plain English: "SELECT" means "show me", "Make, Model" means "the Make and Model values", and "FROM Car" means "for every row in the table Car". That is, "Show me the Make and Model values for each row of the table Car". After executing the query, you will get a set of result rows, each containing a Make and a Model. If you only care about the color of cars registered in 1994, you can run:

SELECT Color FROM Car WHERE Year = 1994;

At this point, you will get a list similar to the following:

Black

Red

White

Blue

Black

White

Yellow

Finally, you can specify a car using the primary key of the table, here VIN:

SELECT * FROM Car WHERE VIN = '2134AFGER245267'

The above query will return the attribute information of the specified vehicle.
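To try these queries without installing a database server, a minimal sketch using Python's built-in sqlite3 module may help. Only the column names and the first VIN come from the text; the sample rows and the other VINs are made up for illustration.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Car (
           Make  TEXT,
           Model TEXT,
           Year  INTEGER,
           Color TEXT,
           VIN   TEXT PRIMARY KEY
       )"""
)

# Hypothetical sample rows, invented for this example.
cars = [
    ("Lexus", "RX 350", 2014, "Black", "2134AFGER245267"),
    ("Honda", "Civic", 1994, "Red", "EXAMPLEVIN000001"),
    ("Ford", "Taurus", 1994, "White", "EXAMPLEVIN000002"),
]
conn.executemany("INSERT INTO Car VALUES (?, ?, ?, ?, ?)", cars)

# Colors of every car registered in 1994.
colors_1994 = [
    row[0]
    for row in conn.execute(
        "SELECT Color FROM Car WHERE Year = 1994 ORDER BY Color"
    )
]

# Look up a single car by its primary key.
one_car = conn.execute(
    "SELECT * FROM Car WHERE VIN = ?", ("2134AFGER245267",)
).fetchone()

# Inserting a second row with an existing VIN violates the primary key.
try:
    conn.execute("INSERT INTO Car VALUES (?, ?, ?, ?, ?)", cars[0])
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(colors_1994)         # ['Red', 'White']
print(one_car)             # ('Lexus', 'RX 350', 2014, 'Black', '2134AFGER245267')
print(duplicate_rejected)  # True
```

The last step shows the database itself refusing a second row with the same VIN, which is what makes the primary key safe to rely on.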

The primary key is defined as unique and non-repeating. That is, a vehicle with a specified VIN can only appear at most once in the table. This is very important. Why? Consider an example:

Relations

Suppose you are running a car repair business. Among other things, you need to track the service history of each car, that is, all the repair work performed on it. You might create a ServiceHistory table with the following columns:

VIN | Make | Model | Year | Color | Service Performed | Mechanic | Price | Date

In this way, each time a vehicle is serviced, we add a new row to the table recording the service details, such as who did the work, how much it cost, and how long it took.

But wait: for any given car, the information about the vehicle itself doesn't change. In other words, if I service my Black 2014 Lexus RX 350 ten times, the Make, Model, Year, and Color remain the same each time. Storing this information repeatedly is inefficient and leads to unnecessary duplicate data. A more reasonable approach is to store it only once and look it up when needed.

So what should we do? We can create a second table, Vehicle, with the following columns:

VIN | Make | Model | Year | Color

In this way, for the ServiceHistory table, we can reduce it to the following columns:

VIN | Service Performed | Mechanic | Price | Date

You might ask, why does VIN appear in both tables? Because we need a way to confirm that a car in the ServiceHistory table refers to a car in the Vehicle table, that is, that two records in the two tables represent the same car. This way, we only need to store each car's own information once. Each time the vehicle comes in for repair, we create a new row in the ServiceHistory table without having to add a new record to the Vehicle table. After all, they refer to the same car.

We can use a SQL query to make use of the implicit relationship between the Vehicle and ServiceHistory tables:

SELECT Vehicle.Model, Vehicle.Year FROM Vehicle, ServiceHistory WHERE Vehicle.VIN = ServiceHistory.VIN AND ServiceHistory.Price > 75.00;

The query is designed to find the Model and Year of all vehicles with a repair cost greater than $75.00. Notice that we pair up records by matching the VIN values in the Vehicle and ServiceHistory tables. The returned rows are those combinations from the two tables that satisfy the criteria. "Vehicle.Model" and "Vehicle.Year" mean that we only want these two columns from the Vehicle table.
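The two-table layout and the query above can be sketched with sqlite3 as follows. The sample data, mechanic names, and dates are invented, and the column name ServicePerformed (without a space) is an assumption made to keep the SQL simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Vehicle (
        VIN TEXT PRIMARY KEY, Make TEXT, Model TEXT, Year INTEGER, Color TEXT
    );
    CREATE TABLE ServiceHistory (
        VIN TEXT REFERENCES Vehicle(VIN),
        ServicePerformed TEXT, Mechanic TEXT, Price REAL, Date TEXT
    );
    """
)

# Each vehicle is stored exactly once...
conn.executemany(
    "INSERT INTO Vehicle VALUES (?, ?, ?, ?, ?)",
    [
        ("2134AFGER245267", "Lexus", "RX 350", 2014, "Black"),
        ("EXAMPLEVIN000001", "Honda", "Civic", 1994, "Red"),
    ],
)
# ...while every visit adds one ServiceHistory row that points back by VIN.
conn.executemany(
    "INSERT INTO ServiceHistory VALUES (?, ?, ?, ?, ?)",
    [
        ("2134AFGER245267", "Oil change", "Sam", 49.99, "2015-01-10"),
        ("2134AFGER245267", "Brake pads", "Sam", 250.00, "2015-06-02"),
        ("EXAMPLEVIN000001", "Tire rotation", "Lee", 80.00, "2015-03-15"),
    ],
)

rows = conn.execute(
    """SELECT Vehicle.Model, Vehicle.Year
       FROM Vehicle, ServiceHistory
       WHERE Vehicle.VIN = ServiceHistory.VIN
         AND ServiceHistory.Price > 75.00
       ORDER BY Vehicle.Model"""
).fetchall()

print(rows)  # [('Civic', 1994), ('RX 350', 2014)]
```

The query returns one row per matching service record, so a vehicle with several repairs over $75.00 would appear several times in the result.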

If our database has no indexes, the above query needs to perform a table scan to locate the matching rows. A table scan checks each row in the table in order, which is usually very slow. In fact, a table scan is generally the slowest way to execute a query.

You can avoid table scans by indexing columns. Think of an index as a data structure that, by keeping values pre-sorted, lets us quickly find a specified value (or a range of values) in the indexed column. If we have an index on the Price column, we don't need to scan the entire table row by row to determine whether each price is greater than 75.00. Instead, we use the information in the index to "jump" to the first row with a price higher than 75.00 and return every subsequent row (since the index is ordered, all of those rows have a price greater than 75.00).
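The difference between a table scan and an index lookup can be illustrated with a sorted list and Python's bisect module. This is only a sketch of the idea: real database engines typically use tree structures such as B-trees rather than a flat sorted list, and the records and function names here are hypothetical.

```python
import bisect

# Hypothetical service records: (price, description), in insertion order.
records = [
    (120.00, "brakes"),
    (35.50, "wiper blades"),
    (75.00, "oil change"),
    (310.25, "timing belt"),
    (49.99, "tire rotation"),
]


def table_scan(rows, threshold):
    """Check every row in order: one comparison per row."""
    return [row for row in rows if row[0] > threshold]


# The "index": prices sorted once up front, each paired with its row position.
index = sorted((price, pos) for pos, (price, _) in enumerate(records))
index_keys = [price for price, _ in index]


def index_lookup(rows, threshold):
    """Binary-search to the first price above threshold, then read onward."""
    start = bisect.bisect_right(index_keys, threshold)
    return [rows[pos] for _, pos in index[start:]]


print(table_scan(records, 75.00))    # [(120.0, 'brakes'), (310.25, 'timing belt')]
print(index_lookup(records, 75.00))  # [(120.0, 'brakes'), (310.25, 'timing belt')]
```

Both functions return the same rows, but index_lookup finds its starting point with a binary search instead of comparing every row. The price is the extra index structure that must be stored and kept in sync with the table.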

Indexing is an indispensable tool for speeding up queries over large amounts of data. Of course, as with everything, there is a cost: the index's data structure consumes memory that could otherwise be used to store data in the database. We have to weigh the pros and cons and find a compromise, but it is very common to index columns that are queried frequently.

The Clear Box

Because the database can examine a table's schema (which describes what type of data each column contains), advanced features like indexes can be implemented, and the database can make reasonable decisions based on the data. In other words, to the database a table is the opposite of a "black box": it is a clear, transparent box.

Keep this in mind when we talk about NoSQL databases. It is also a very important point when it comes to the query capabilities of different types of database engines.

Schemas

We already know that the schema of a table describes the name of the column and the type of data it contains. It also includes other information, such as which columns can be empty, which columns do not allow duplicate values, and other restrictions on the columns in the table. A table can only have one schema at any time, and all rows in the table must comply with the schema.

This is a very important constraint. Suppose you have a database table holding information on millions of customers. Your sales team wants to add additional information (for example, the age of the user) to improve the accuracy of their email marketing algorithms. This requires an ALTER TABLE operation that adds a new column. We also need to decide whether every row in the table must have a value for that column. It often makes sense to require a value, but doing so may demand information we can't easily obtain (such as the age of every user already in the database). So trade-offs are needed here as well.
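The trade-off described above can be seen in a small sqlite3 sketch. The Customer table, its columns, and the sample row are hypothetical, and the behavior shown is SQLite's; other database engines handle ALTER TABLE differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (Email TEXT PRIMARY KEY, Name TEXT NOT NULL)"
)
conn.execute("INSERT INTO Customer VALUES ('a@example.com', 'Alice')")

# Add the new column. Existing rows have no age, so the column allows NULL.
conn.execute("ALTER TABLE Customer ADD COLUMN Age INTEGER")
existing = conn.execute("SELECT Email, Name, Age FROM Customer").fetchall()

# Requiring a value without supplying a default is rejected by SQLite,
# because the rows already in the table would have nothing to put there.
try:
    conn.execute("ALTER TABLE Customer ADD COLUMN Region TEXT NOT NULL")
    not_null_allowed = True
except sqlite3.OperationalError:
    not_null_allowed = False

print(existing)          # [('a@example.com', 'Alice', None)]
print(not_null_allowed)  # False
```

Existing rows simply get NULL for the new Age column, which is exactly the "information we can't easily obtain" case from the text.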

In addition, making changes to a large database is usually not a trivial matter. In order to prevent errors, it is very important to have a rollback solution. But even so, once the schema is changed, we are not always able to revoke these changes. Maintenance of the schema may be one of the most difficult parts of the DBA's work.

Key/Value Stores

Before the term "NoSQL" existed, key/value data stores like memcached provided data storage without the need for a table schema. In fact, a K/V store has no concept of a "table" at all. There are only keys and values. If a key-value store sounds familiar, that may be because the concept matches Python's dict and set: a hash table provides fast lookups by key. A primitive Python-based NoSQL database is, in simple terms, one large dictionary.

To understand how it works, let's write one by hand! First, some simple design decisions:

A Python dict as the primary data store

Only strings are supported as keys

Support for storing integers, strings, and lists

A simple TCP/IP server that uses ASCII strings to deliver messages

Some advanced commands such as INCREMENT, DELETE, APPEND, and STATS

One advantage of an ASCII-based TCP/IP interface is that we can interact with the server using a simple telnet program and don't need a special client (although writing one is a very good exercise and can be done in about 15 lines of code).

We also need a "wire format" for the messages we send to the server and the responses it returns. Here is a simple description:
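As a sketch of the client exercise mentioned above, the following helper opens a connection, sends one message, and reads the reply until the server closes the connection. The function name send_command is hypothetical; the default host and port are the ones the server setup later in this article uses.

```python
import socket


def send_command(message, host="localhost", port=50505):
    """Send one request string and return the server's reply as text."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(message.encode())
        chunks = []
        # The server closes the connection after replying, so read until EOF.
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks).decode()
```

With the server running, a call such as send_command("GET;foo;;") would return the raw response string for that request.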

Commands Supported

PUT

Parameters: Key, Value

Purpose: Insert a new entry into the database

GET

Parameters: Key

Purpose: Retrieve a stored value from the database

PUTLIST

Parameters: Key, Value

Purpose: Insert a new list entry into the database

GETLIST

Parameters: Key

Purpose: Retrieve a stored list from the database

APPEND

Parameters: Key, Value

Purpose: Add a new element to an existing list in the database

INCREMENT

Parameters: Key

Purpose: Increment an integer value in the database

DELETE

Parameters: Key

Purpose: Delete an entry from the database

STATS

Parameters: None (N/A)

Purpose: Request statistics for success/failure of each executed command

Now let's define the structure of the message itself.

Message Structure

Request Messages

A Request Message contains a command, a key, a value, and a value type. The last three are optional, depending on the command. A semicolon (;) is used as the separator. Even when the optional fields are omitted, the message must still contain three ; characters.

COMMAND; [KEY]; [VALUE]; [VALUE TYPE]

COMMAND is one of the commands in the list above

KEY is a string that can be used as a database key (optional)

VALUE is an integer, list, or string to be stored in the database (optional)

A list is represented as a comma-separated string, for example "red,green,blue"

VALUE TYPE describes how VALUE should be interpreted

Possible type values are: INT, STRING, LIST

Examples

PUT;foo;1;INT

GET;foo;;

PUTLIST;bar;a,b,c;LIST

APPEND;bar;d;STRING

GETLIST;bar;;

STATS;;;

INCREMENT;foo;;

DELETE;foo;;
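A request in this format can be assembled with a small helper. The function build_request is hypothetical and not part of the server; it is only a sketch showing that every message contains exactly three separators, with omitted fields left empty and no whitespace around the separators.

```python
def build_request(command, key="", value="", value_type=""):
    """Join the four fields with ';', leaving omitted fields empty."""
    if isinstance(value, list):
        # Lists travel over the wire as a comma-separated string.
        value = ",".join(str(item) for item in value)
    return ";".join([command, key, str(value), value_type])


print(build_request("PUT", "foo", 1, "INT"))                    # PUT;foo;1;INT
print(build_request("GET", "foo"))                              # GET;foo;;
print(build_request("PUTLIST", "bar", ["a", "b", "c"], "LIST"))  # PUTLIST;bar;a,b,c;LIST
print(build_request("STATS"))                                   # STATS;;;
```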

Response Messages

A Response Message consists of two parts separated by ;. The first part is always True or False, depending on whether the command executed successfully. The second part is the command message: when an error occurs, it contains the error message. For successful commands that have no natural return value (such as PUT), it contains a success message. For commands that return a value on success (such as GET), it contains the value itself.

Examples

True;Key [foo] set to [1]

True;1

True;Key [bar] set to [['a', 'b', 'c']]

True;Key [bar] had value [d] appended

True;['a', 'b', 'c', 'd']

True;{'PUTLIST': {'success': 1, 'error': 0}, 'STATS': {'success': 0, 'error': 0}, 'INCREMENT': {'success': 0, 'error': 0}, 'GET': {'success': 0, 'error': 0}, 'PUT': {'success': 0, 'error': 0}, 'GETLIST': {'success': 1, 'error': 0}, 'APPEND': {'success': 1, 'error': 0}, 'DELETE': {'success': 0, 'error': 0}}
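On the client side, a response can be taken apart again with a few lines. The helper parse_response is hypothetical; it is a sketch that splits only at the first ;, since the message part may itself contain that character.

```python
def parse_response(raw):
    """Split a reply into (succeeded, payload) at the first ';' only."""
    status, _, payload = raw.partition(";")
    return status.strip() == "True", payload.strip()


print(parse_response("True;1"))
# (True, '1')
print(parse_response("False;Error: Key [foo] not found"))
# (False, 'Error: Key [foo] not found')
```

Note that the payload always arrives as text: a client that needs the integer 1 rather than the string '1' has to convert it itself.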

Show Me The Code!

I will present the code in blocks. The entire program is only about 180 lines, so it won't take long to read.

Set Up

Here is some of the boilerplate code we need for our server:

"""NoSQL database written in Python"""

# Standard library imports
import socket

HOST = 'localhost'
PORT = 50505
SOCKET = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Per-command counters of successful and failed executions
STATS = {
    'PUT': {'success': 0, 'error': 0},
    'GET': {'success': 0, 'error': 0},
    'GETLIST': {'success': 0, 'error': 0},
    'PUTLIST': {'success': 0, 'error': 0},
    'INCREMENT': {'success': 0, 'error': 0},
    'APPEND': {'success': 0, 'error': 0},
    'DELETE': {'success': 0, 'error': 0},
    'STATS': {'success': 0, 'error': 0},
}

It's easy to see that the above is just a package import and some data initialization.

Set Up (Cont'd)

Next, I will skip ahead a little so that I can show the rest of the setup code. Note that it refers to some functions that don't exist yet, but that's okay; we'll cover them later. In the full version (presented at the end), everything appears in its proper order. Here is the remaining setup code:

COMMAND_HANDLERS = {
    'PUT': handle_put,
    'GET': handle_get,
    'GETLIST': handle_getlist,
    'PUTLIST': handle_putlist,
    'INCREMENT': handle_increment,
    'APPEND': handle_append,
    'DELETE': handle_delete,
    'STATS': handle_stats,
}

DATA = {}

def main():
    """Main entry point for script"""
    # bind() takes a single (host, port) tuple
    SOCKET.bind((HOST, PORT))
    SOCKET.listen(1)
    while 1:
        connection, address = SOCKET.accept()
        print('New connection from [{}]'.format(address))
        data = connection.recv(4096).decode()
        command, key, value = parse_message(data)
        if command == 'STATS':
            response = handle_stats()
        elif command in ('GET', 'GETLIST', 'INCREMENT', 'DELETE'):
            response = COMMAND_HANDLERS[command](key)
        elif command in (
                'PUT',
                'PUTLIST',
                'APPEND',):
            response = COMMAND_HANDLERS[command](key, value)
        else:
            response = (False, 'Unknown command type {}'.format(command))
        # Unknown commands have no entry in STATS, so skip them
        if command in STATS:
            update_stats(command, response[0])
        # sendall() needs bytes, so encode the reply
        connection.sendall('{};{}'.format(response[0], response[1]).encode())
        connection.close()

if __name__ == '__main__':
    main()

We created COMMAND_HANDLERS, which is often referred to as a look-up table. The job of COMMAND_HANDLERS is to associate each command with the function used to process it. For example, if we receive a GET command, COMMAND_HANDLERS[command](key) is equivalent to handle_get(key). Remember, in Python a function can be treated as a value and, like any other value, stored in a dict.

In the above code, although some commands take the same parameters, I still chose to handle each command separately. Forcing every handle_ function to accept both a key and a value would be simple but crude; I prefer handlers that are well organized, easy to test, and less error-prone.
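The look-up table idea can be tried in isolation with a few lines. The functions toy_get and toy_delete and the dispatch helper are stand-ins invented for this sketch; they are not the server's real handlers.

```python
def toy_get(key):
    return "get {}".format(key)


def toy_delete(key):
    return "delete {}".format(key)


# Functions are ordinary values, so they can be stored as dict values.
HANDLERS = {"GET": toy_get, "DELETE": toy_delete}


def dispatch(command, key):
    """Look the command up in the table and call whatever function is found."""
    handler = HANDLERS.get(command)
    if handler is None:
        return "unknown command {}".format(command)
    return handler(key)


print(dispatch("GET", "foo"))     # get foo
print(dispatch("DELETE", "foo"))  # delete foo
print(dispatch("RENAME", "foo"))  # unknown command RENAME
```

Supporting a new command then only requires writing one function and adding one entry to the table; the dispatching code itself does not change.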

Note that the socket-related code is very minimal. Although the entire server is based on TCP/IP communication, there is not much underlying network interaction code.

Finally, notice the DATA dictionary. It looks so unremarkable that you might easily miss it, but DATA holds the key-value pairs that are actually stored, and they are what make up our database.

Command Parser

Let's look at the command parser, which is responsible for interpreting the received message:

def parse_message(data):
    """Return a tuple containing the command, the key, and (optionally) the
    value cast to the appropriate type."""
    command, key, value, value_type = data.strip().split(';')
    if value_type:
        if value_type == 'LIST':
            value = value.split(',')
        elif value_type == 'INT':
            value = int(value)
        else:
            value = str(value)
    else:
        value = None
    return command, key, value

Here we can see that a type conversion takes place. If the value should be a list, we get it by calling str.split(',') on the string. For an int, we simply call int() with the string as its argument. The same is true for strings and str().
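The conversions themselves are easy to try in an interpreter. The following lines are only an illustration with made-up inputs; they also show two details worth knowing: str.split(',') keeps any spaces around the elements, and int() raises ValueError for text that is not a valid integer.

```python
# A LIST value is split on commas; note that surrounding spaces are kept.
letters = "a,b,c".split(",")
colors = "red, green, blue".split(",")

# An INT value becomes a real integer that supports arithmetic.
count = int("1") + 1

# A value that is not a valid integer raises ValueError.
try:
    int("one")
    bad_int_rejected = False
except ValueError:
    bad_int_rejected = True

print(letters)           # ['a', 'b', 'c']
print(colors)            # ['red', ' green', ' blue']
print(count)             # 2
print(bad_int_rejected)  # True
```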

Command Handlers

Below is the code for the command handlers. They are all quite intuitive and easy to understand. Note that although there is a fair amount of error checking, it is neither exhaustive nor complicated. If you spot any errors while reading, please point them out in the discussion.

def update_stats(command, success):
    """Update the STATS dict with info about if executing *command* was a
    *success*"""
    if success:
        STATS[command]['success'] += 1
    else:
        STATS[command]['error'] += 1

def handle_put(key, value):
    """Return a tuple containing True and the message to send back to the
    client."""
    DATA[key] = value
    return (True, 'Key [{}] set to [{}]'.format(key, value))

def handle_get(key):
    """Return a tuple containing True if the key exists and the message to send
    back to the client"""
    if key not in DATA:
        return (False, 'Error: Key [{}] not found'.format(key))
    else:
        return (True, DATA[key])

def handle_putlist(key, value):
    """Return a tuple containing True if the command succeeded and the message
    to send back to the client."""
    return handle_put(key, value)

def handle_getlist(key):
    """Return a tuple containing True if the key contained a list and the
    message to send back to the client."""
    return_value = exists, value = handle_get(key)
    if not exists:
        return return_value
    elif not isinstance(value, list):
        return (False, 'ERROR: Key [{}] contains non-list value ([{}])'.format(
            key, value))
    else:
        return return_value

def handle_increment(key):
    """Return a tuple containing True if the key's value could be incremented
    and the message to send back to the client."""
    return_value = exists, value = handle_get(key)
    if not exists:
        return return_value
    elif not isinstance(value, int):
        # Only integer values can be incremented
        return (False, 'ERROR: Key [{}] contains non-int value ([{}])'.format(
            key, value))
    else:
        DATA[key] = value + 1
        return (True, 'Key [{}] incremented'.format(key))

def handle_append(key, value):
    """Return a tuple containing True if the key's value could be appended to
    and the message to send back to the client."""
    return_value = exists, list_value = handle_get(key)
    if not exists:
        return return_value
    elif not isinstance(list_value, list):
        return (False, 'ERROR: Key [{}] contains non-list value ([{}])'.format(
            key, value))
    else:
        DATA[key].append(value)
        return (True, 'Key [{}] had value [{}] appended'.format(key, value))

def handle_delete(key):
    """Return a tuple containing True if the key could be deleted and the
    message to send back to the client."""
    if key not in DATA:
        return (
            False,
            'ERROR: Key [{}] not found and could not be deleted.'.format(key))
    else:
        del DATA[key]
        # Report success so the caller always receives a (bool, message) tuple
        return (True, 'Key [{}] deleted'.format(key))

def handle_stats():
    """Return a tuple containing True and the contents of the STATS dict."""
    return (True, str(STATS))

There are two things to note: multiple assignment and code reuse. Some functions are simple wrappers that add a little logic around existing functions, such as handle_getlist around handle_get. Sometimes we just want to pass along an existing function's return value, and at other times we need to inspect what it returned; multiple assignment lets us do both.

Take handle_append as an example. If we call handle_get and the key doesn't exist, we simply return whatever handle_get returned. To do that, we want to refer to the tuple returned by handle_get as a single value, so that when the key does not exist we can just write return return_value.

If the key does exist, we need to inspect the returned value, so we also want to refer to the elements of handle_get's return value as separate variables. To handle both cases, we use multiple assignment. This keeps the code clear without requiring extra lines. The statement return_value = exists, list_value = handle_get(key) makes it explicit that we are referring to the value returned by handle_get in two different ways.
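The multiple assignment pattern can be tried on its own. In this sketch, lookup is a hypothetical stand-in for handle_get that takes the store as an explicit argument so the example is self-contained.

```python
def lookup(store, key):
    """Hypothetical stand-in for handle_get: returns (found, value_or_message)."""
    if key not in store:
        return (False, "Error: Key [{}] not found".format(key))
    return (True, store[key])


store = {"bar": ["a", "b"]}

# One statement binds the whole tuple AND its two elements.
return_value = exists, value = lookup(store, "bar")
print(return_value)  # (True, ['a', 'b'])
print(exists)        # True
print(value)         # ['a', 'b']

# For a missing key, the whole tuple can be passed along unchanged.
missing = found, message = lookup(store, "nope")
print(missing)  # (False, 'Error: Key [nope] not found')
```

Python evaluates the right-hand side once and then assigns it to each target from left to right, so the function is not called twice.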
