How to Create a DynamoDB Table on AWS

This article shows you how to create your first DynamoDB table. Amazon DynamoDB is the primary database in AWS for building serverless applications. It is a fully managed NoSQL database, so you do not have to manage any servers. Unlike many NoSQL databases, DynamoDB also supports strongly consistent reads, although they consume twice the read capacity of eventually consistent reads.

Attributes in DynamoDB are analogous to columns, and items are analogous to rows in a relational database. However, there is no table-level schema in DynamoDB beyond the primary key. Different items (rows) can have different sets of attributes, and an attribute with the same name can have different types in different items.

Table Example

In this tutorial, let’s create a table for a global FoxuTech team registry. We want to track the name, short name, team, colors, and location of each team. For example, the FoxuTech entry would contain the following:

  1. Name – foxutech
  2. ShortName – FOX
  3. Team – Technology
  4. Colors – Yellow, Gold
  5. Location – Bangalore, IN

Before loading our data into the table, we are going to create the table and assign data types to its key attributes.

DynamoDB Create Table from Console

Let’s walk through the steps for creating a table in DynamoDB from the AWS Console to track FoxuTech teams. Make sure you are logged in and have navigated to the DynamoDB service.

  1. Create Table – Our table name is ‘foxteam’. The table name must be unique within an AWS account and region.
  2. Partition Key (Primary Key) – The primary key is either the partition key alone (which we are entering here) or a combination of the partition key and a sort key (if a sort key is used). DynamoDB hashes the partition key value to decide which partition stores the item; it does not sort by it. For our FoxuTech teams table, the partition key is ‘name’ (string).
  3. Add Sort Key – The sort key is an additional key that orders the items sharing the same partition key and enables richer queries. Every query must specify a partition key value. For the FoxuTech teams table, we have ‘name’ as our partition key; let’s add ‘location’ (string) as our sort key.
  4. Default Settings/Options – If you are happy with the defaults, you are ready to create the table. If not, let’s discuss the advanced settings below.
    1. Secondary Indexes – Remember the sort key we added to our primary key? We can use secondary indexes when we want to set up indexes independent of the table’s primary key. For the FoxuTech teams table, let’s set a global secondary index with ‘team’ (string) as the partition key and ‘colors’ (string) as the sort key. The team secondary index will have read/write capacity units separate from those of the table.
    2. Provisioned Capacity – The default setting is 5 RCUs (read capacity units) and 5 WCUs (write capacity units). To set the provisioned capacity manually, you must disable auto scaling. We will leave the provisioned capacity at the default setting.
    3. Auto Scaling – Lets a DynamoDB developer configure how AWS scales RCUs and WCUs to create elastic tables. We will leave auto scaling enabled and keep the default settings.
    4. Create – Now we are ready to create our table.
    5. Items – After creating the table, we can navigate to Items and create items. Let’s create our first item for FoxuTech. Make sure each attribute has the correct data type. Attribute names are case-sensitive, so the key attributes must be entered exactly as defined in the table and index key schemas.
      1. name (string) – FoxuTech
      2. ShortName (string) – FOX
      3. team (string) – Technology
      4. colors (string) – Yellow, Gold
      5. location (string) – Bangalore, IN
    6. Save – Hit the Save button. We have now created a FoxuTech team DynamoDB table and added its first item.
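
The console steps above can also be sketched with the AWS CLI. This is a minimal sketch, not the exact request the console sends: the index name team-colors-index and the admin profile are assumptions, the attribute names are lowercase to match the key schema, and colors is stored as a single string as in the console example. The commands are wrapped in shell functions so nothing runs until you call them.

```shell
# Create the foxteam table with a composite primary key (name + location)
# and one global secondary index (team + colors).
# The index name "team-colors-index" is hypothetical.
create_foxteam_table() {
  aws dynamodb create-table \
    --table-name foxteam \
    --attribute-definitions \
      AttributeName=name,AttributeType=S \
      AttributeName=location,AttributeType=S \
      AttributeName=team,AttributeType=S \
      AttributeName=colors,AttributeType=S \
    --key-schema \
      AttributeName=name,KeyType=HASH \
      AttributeName=location,KeyType=RANGE \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
    --global-secondary-indexes '[
      {
        "IndexName": "team-colors-index",
        "KeySchema": [
          {"AttributeName": "team", "KeyType": "HASH"},
          {"AttributeName": "colors", "KeyType": "RANGE"}
        ],
        "Projection": {"ProjectionType": "ALL"},
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
      }
    ]' \
    --profile admin
}

# Print the first item in DynamoDB's typed JSON format ("S" marks a string).
foxutech_item_json() {
  cat <<'EOF'
{
  "name": {"S": "FoxuTech"},
  "ShortName": {"S": "FOX"},
  "team": {"S": "Technology"},
  "colors": {"S": "Yellow, Gold"},
  "location": {"S": "Bangalore, IN"}
}
EOF
}

# Store the item; the table must already be ACTIVE.
put_foxutech_item() {
  aws dynamodb put-item \
    --table-name foxteam \
    --item "$(foxutech_item_json)" \
    --profile admin
}
```

Calling create_foxteam_table followed by put_foxutech_item (once the table is active) should leave you with roughly the same table and item as the console walkthrough.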

DynamoDB Table Structure

Creating a table using CLI commands

You need a working AWS account and an installed AWS CLI configured with a profile that has the necessary permissions. You are also expected to have a decent understanding of AWS CLI commands, AWS CloudFormation, and basic database concepts. For the complete code files for this article, you can refer to:

#1 – We can create a simple DynamoDB table using the aws dynamodb create-table CLI command as follows:

aws dynamodb create-table \
    --table-name my_table \
    --attribute-definitions AttributeName=id,AttributeType=S AttributeName=datetime,AttributeType=N \
    --key-schema AttributeName=id,KeyType=HASH AttributeName=datetime,KeyType=RANGE \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
    --region us-east-1 \
    --profile admin

Here, we define a table named my_table and use the attribute-definitions property to declare two attributes: id of type string (denoted by S) and datetime of type number (denoted by N). We then define a partition key (or hash key) and a sort key (or range key) using the key-schema property. We also define the maximum expected read and write capacity units per second using the provisioned-throughput property. I have specified the region explicitly even though us-east-1 is already the default region for this profile.

#2 – List tables using the aws dynamodb list-tables CLI command to verify that our table was created:

aws dynamodb list-tables \
--region us-east-1 \
--profile admin

#3 – Use the aws dynamodb describe-table CLI command to see the table properties:

aws dynamodb describe-table \
--table-name my_table \
--profile admin

The initial part of the response contains the table name, attribute definitions, and key schema definition we specified while creating the table. The latter part of the response contains TableStatus, CreationDateTime, ProvisionedThroughput, TableSizeBytes, ItemCount, TableArn, and TableId.

#4 – You may use the aws dynamodb update-table CLI command to update the table:

aws dynamodb update-table \
    --table-name my_table \
    --provisioned-throughput ReadCapacityUnits=10,WriteCapacityUnits=10 \
    --profile admin

#5 – Finally, you may delete the table using the aws dynamodb delete-table CLI command:

aws dynamodb delete-table \
--table-name my_table \
--profile admin

How it works…

We used the following DynamoDB CLI command actions in this recipe: create-table, list-tables, describe-table, update-table, and delete-table. CloudFormation templates use corresponding components and properties as well. Some of these options will become clear after you read the following notes.

DynamoDB data model

Data in DynamoDB is stored in tables. A table contains items (like rows) and each item contains attributes (like columns). Each item can have a different set of attributes, and the same attribute name may be used with different types in different items. DynamoDB supports the data types string, number, binary, Boolean, null, string set, number set, binary set, list, and map. It does not have a JSON data type; however, you can pass JSON data to DynamoDB using the SDK and it will be mapped to native DynamoDB data types. You can also define indexes (global secondary indexes and local secondary indexes) to improve read performance.
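
To make the data model concrete, here is a small sketch of two items in the typed JSON format that the CLI uses. It assumes a table named foxteam whose key attributes are name and location (both strings) and an admin profile; the non-key attributes are made up for illustration. Note that colors is a plain string in the first item and a string set in the second.

```shell
# A minimal item: only strings ("S").
team_item_plain() {
  cat <<'EOF'
{
  "name": {"S": "FoxuTech"},
  "location": {"S": "Bangalore, IN"},
  "colors": {"S": "Yellow, Gold"}
}
EOF
}

# A richer item for the same table. Numbers are passed as strings under "N".
# All non-key attributes here are hypothetical.
team_item_rich() {
  cat <<'EOF'
{
  "name": {"S": "FoxuTech"},
  "location": {"S": "Chennai, IN"},
  "colors": {"SS": ["Yellow", "Gold"]},
  "members": {"N": "12"},
  "active": {"BOOL": true},
  "mascot": {"NULL": true},
  "tags": {"L": [{"S": "devops"}, {"N": "2024"}]},
  "contact": {"M": {"email": {"S": "team@example.com"}}}
}
EOF
}

# Store both items; only the key attributes must keep the same type.
put_team_items() {
  aws dynamodb put-item --table-name foxteam \
    --item "$(team_item_plain)" --profile admin &&
  aws dynamodb put-item --table-name foxteam \
    --item "$(team_item_rich)" --profile admin
}
```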

Data model limits

The following are some of the important limits in the DynamoDB data model:

  • There is a default quota on the number of tables per region for an AWS account (256 when this was written; AWS has since raised the default to 2,500). It can be raised by requesting a quota increase from AWS.
  • Names for tables and secondary indexes must be at least three characters long, but no more than 255 characters. Allowed characters are A-Z, a-z, 0-9, _ (underscore), - (hyphen), and . (dot).
  • An attribute name must be at least one character long but no greater than 64 KB long. Names of key attributes (table and secondary index partition and sort keys) and of user-specified projected attributes are more restricted: they must be encoded using UTF-8, and the total size of each encoded name cannot exceed 255 bytes.
  • The size of an item, including all the attribute names and attribute values, cannot exceed 400 KB.
  • You can create a maximum of five local secondary indexes per table. Global secondary indexes have a default quota per table (five when this was written; AWS has since raised the default to 20).
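
The naming rule above is easy to check before calling create-table. The following is a minimal sketch of such a check; the function name is_valid_table_name is made up for this example and is not part of the AWS CLI.

```shell
# Return 0 if the argument is a valid table or index name:
# 3 to 255 characters, using only A-Z, a-z, 0-9, underscore, hyphen, and dot.
is_valid_table_name() {
  candidate=$1
  length=${#candidate}
  [ "$length" -ge 3 ] && [ "$length" -le 255 ] || return 1
  case $candidate in
    *[!A-Za-z0-9_.-]*) return 1 ;;  # contains a character outside the allowed set
  esac
  return 0
}
```

For example, `is_valid_table_name my_table && echo ok` prints ok, while a two-character name or a name containing a space is rejected.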

DynamoDB keys and partitions

Each item is identified with a primary key, which can be either only the partition key if it can uniquely identify the item or a combination of partition key and sort key. The partition key is also called a hash key and the sort key is also called a range key. Primary key attributes (partition and sort keys) can only be string, binary, or number.

Initially, a single partition holds all table data. When a partition’s limits are exceeded, new partitions are created and data is spread across them. Current limits per partition are 10 GB of storage, 3,000 RCUs, and 1,000 WCUs. Data belonging to one partition key is stored in the same partition; however, a single partition can have data for multiple partition keys. The partition key is used to locate the partition and the sort key is used to order items within that partition.
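
As a rough illustration of how these per-partition limits interact, the sketch below estimates a partition count from provisioned throughput and table size. The formula is a rule of thumb based on the three limits quoted above, not something DynamoDB exposes or guarantees, and the function name estimate_partitions is made up for this example.

```shell
# Rough estimate only: the larger of
#   ceil(RCU / 3000 + WCU / 1000)   and   ceil(size_in_GB / 10)
# Usage: estimate_partitions <rcu> <wcu> <size_gb>
estimate_partitions() {
  # RCU/3000 + WCU/1000 == (RCU + 3*WCU) / 3000, rounded up with integer math
  by_throughput=$(( ($1 + 3 * $2 + 2999) / 3000 ))
  by_size=$(( ($3 + 9) / 10 ))
  estimate=$by_throughput
  [ "$by_size" -gt "$estimate" ] && estimate=$by_size
  [ "$estimate" -lt 1 ] && estimate=1
  echo "$estimate"
}
```

With the defaults used in this article (5 RCUs, 5 WCUs, a small table), the estimate is a single partition; a table provisioned at 3,000 RCUs and 1,000 WCUs would already be estimated at two.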

Read and write capacity units

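We specified the read and write capacity we expect our application to need per second, expressed in read capacity units (RCUs) and write capacity units (WCUs). One RCU covers one strongly consistent read per second (or two eventually consistent reads per second) of an item up to 4 KB, and one WCU covers one write per second of an item up to 1 KB. We also updated our RCUs and WCUs with update-table. Updating the table properties is an asynchronous operation and may take some time to take effect.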
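
To get a feel for the numbers, here is a small capacity calculator. It is a sketch under the standard sizing rules (one RCU per 4 KB strongly consistent read per second, half that for eventually consistent reads, one WCU per 1 KB write per second). The function names are made up for this example, and 1 KB is taken as 1,024 bytes.

```shell
# Integer ceiling of A / B.
ceil_div() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Usage: rcu_needed <item_size_bytes> <reads_per_second> [strong|eventual]
rcu_needed() {
  units_per_read=$(ceil_div "$1" 4096)   # item size rounded up to 4 KB blocks
  total=$(( units_per_read * $2 ))
  if [ "${3:-strong}" = "eventual" ]; then
    total=$(ceil_div "$total" 2)         # eventually consistent reads cost half
  fi
  echo "$total"
}

# Usage: wcu_needed <item_size_bytes> <writes_per_second>
wcu_needed() {
  units_per_write=$(ceil_div "$1" 1024)  # item size rounded up to 1 KB blocks
  echo $(( units_per_write * $2 ))
}
```

For instance, reading a 6,000-byte item 10 times per second needs 20 RCUs with strong consistency and 10 RCUs with eventual consistency, while writing a 1,500-byte item 10 times per second needs 20 WCUs.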

Waiting for asynchronous operations

The CLI commands create-table, update-table, and delete-table are asynchronous operations. The control returns immediately to the command line but the operation runs asynchronously.

To wait for table creation, you can use the aws dynamodb wait table-exists --table-name <table-name> command, which polls with describe-table until the table is active. The wait table-exists command may be used in scripts to wait until the table is created before inserting data. Similarly, you can wait for table deletion using the aws dynamodb wait table-not-exists --table-name <table-name> command, which polls with describe-table until ResourceNotFoundException is thrown. Both wait options poll every 20 seconds and exit with a 255 return code after 25 failed checks.
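
In a script, the create and wait steps are typically chained. The following is a minimal sketch that reuses the my_table definition and the admin profile from this recipe; the function names are made up for this example.

```shell
# Create a table and block until it is ACTIVE.
# Usage: create_table_and_wait <table-name>
create_table_and_wait() {
  aws dynamodb create-table \
    --table-name "$1" \
    --attribute-definitions AttributeName=id,AttributeType=S AttributeName=datetime,AttributeType=N \
    --key-schema AttributeName=id,KeyType=HASH AttributeName=datetime,KeyType=RANGE \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
    --profile admin || return 1
  # Polls describe-table; returns non-zero if the table never becomes active.
  aws dynamodb wait table-exists --table-name "$1" --profile admin
}

# Delete a table and block until it is gone.
# Usage: delete_table_and_wait <table-name>
delete_table_and_wait() {
  aws dynamodb delete-table --table-name "$1" --profile admin || return 1
  aws dynamodb wait table-not-exists --table-name "$1" --profile admin
}
```

A data-loading script could then run `create_table_and_wait my_table && echo "ready for put-item"` and only start inserting once the function returns successfully.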

Other ways to create tables

We created our table by specifying the properties, such as attribute-definitions, key-schema, provisioned-throughput, and so on. Instead, you can specify a JSON snippet or JSON file using the --cli-input-json option. The --generate-cli-skeleton option prints a sample template in the format that --cli-input-json expects.
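
As a sketch of the JSON route, the functions below write the same my_table definition to a file and pass it to create-table. The function names and the file path you pass in are assumptions for illustration; the admin profile is the one used throughout this recipe.

```shell
# Write the table definition to the given path.
# Usage: write_table_definition <path>
write_table_definition() {
  cat > "$1" <<'EOF'
{
  "TableName": "my_table",
  "AttributeDefinitions": [
    {"AttributeName": "id", "AttributeType": "S"},
    {"AttributeName": "datetime", "AttributeType": "N"}
  ],
  "KeySchema": [
    {"AttributeName": "id", "KeyType": "HASH"},
    {"AttributeName": "datetime", "KeyType": "RANGE"}
  ],
  "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
}
EOF
}

# Create the table from a definition file.
# Usage: create_table_from_json <path>
create_table_from_json() {
  aws dynamodb create-table --cli-input-json "file://$1" --profile admin
}

# To see every field the JSON may contain, print the skeleton:
#   aws dynamodb create-table --generate-cli-skeleton
```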

You can also create DynamoDB tables from Java code using the AWS SDK. However, in most real-world cases, CloudFormation templates are used to create and provision tables and the AWS SDK is used to work with data items.

There’s more…

Let’s first look at some features and limitations of DynamoDB. We will also cover some theory on local secondary indexes (LSIs) and global secondary indexes (GSIs).

DynamoDB features

The following are some of the important features of DynamoDB:

  • DynamoDB is a fully managed NoSQL database service. There are no servers to manage.
  • DynamoDB has the characteristics of both the key-value and the document-based NoSQL families.
  • Throughput and storage are virtually unlimited. The table scales very well, but only within the provisioned throughput configuration.
  • DynamoDB replicates data into three different facilities within the same region for availability and fault tolerance. You can also set up cross-region replication, for example with global tables.
  • It supports eventually consistent reads as well as strongly consistent reads.
  • DynamoDB is schemaless at the table level. Each item (row) can have a different set of attributes. Even the same attribute name can be associated with different types in different items.
  • DynamoDB automatically partitions and re-partitions data as the table grows in size.
  • You can store JSON and then do nested queries on that data using the AWS SDK.
  • Data is stored on SSD storage.
  • DynamoDB supports atomic updates and atomic counters.
  • DynamoDB supports conditional operations for put, update, and delete.
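
A few of the features above can be sketched with the CLI against the my_table table from this recipe. The visits attribute and the function names are made up for this example, and expression attribute names (#pk, #v) are used so the sketch does not depend on DynamoDB's reserved-word list. The key values are interpolated into JSON without escaping, so they must not contain double quotes.

```shell
# Print the primary key of an item in typed JSON.
# Usage: item_key_json <id> <datetime>
item_key_json() {
  printf '{"id": {"S": "%s"}, "datetime": {"N": "%s"}}' "$1" "$2"
}

# Conditional put: write the item only if no item with this key exists yet.
put_if_absent() {
  aws dynamodb put-item \
    --table-name my_table \
    --item "$(item_key_json "$1" "$2")" \
    --condition-expression 'attribute_not_exists(#pk)' \
    --expression-attribute-names '{"#pk": "id"}' \
    --profile admin
}

# Atomic counter: add 1 to the hypothetical "visits" attribute.
increment_visits() {
  aws dynamodb update-item \
    --table-name my_table \
    --key "$(item_key_json "$1" "$2")" \
    --update-expression 'ADD #v :one' \
    --expression-attribute-names '{"#v": "visits"}' \
    --expression-attribute-values '{":one": {"N": "1"}}' \
    --return-values UPDATED_NEW \
    --profile admin
}

# Strongly consistent read of a single item.
get_item_consistent() {
  aws dynamodb get-item \
    --table-name my_table \
    --key "$(item_key_json "$1" "$2")" \
    --consistent-read \
    --profile admin
}
```

If the item already exists, put_if_absent fails with a ConditionalCheckFailedException instead of overwriting it.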

DynamoDB general limitations

Here are some of the general limitations of DynamoDB:

  • DynamoDB does not support complex relational queries such as joins or complex transactions.
  • DynamoDB is not suited for storing a large amount of data that is rarely accessed. S3 may be better suited for such use cases.
  • You cannot select the Availability Zone for your DynamoDB table.
  • Default replication of data for availability and fault tolerance is only within a region.

Local and global secondary indexes

You can define LSIs and GSIs for your tables to improve read performance. An LSI can be considered an alternate sort key for a given partition-key value. A GSI contains attributes from the base table and organizes them by a primary key that is different from that of the base table.

Secondary indexes are useful when you want to query based on non-key attributes. You can create them with the CLI as well as with CloudFormation templates. There is a limit of five LSIs per table, and a default quota on GSIs per table (five when this was written; AWS has since raised the default to 20).
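
As a hedged sketch of working with a GSI from the CLI, the functions below add an index to an existing table and then query it. The category attribute, the category-index name, and the function names are assumptions for illustration. An LSI, by contrast, can only be defined when the table is created, so it cannot be added this way.

```shell
# Add a global secondary index on the hypothetical "category" attribute.
add_category_index() {
  aws dynamodb update-table \
    --table-name my_table \
    --attribute-definitions AttributeName=category,AttributeType=S \
    --global-secondary-index-updates '[
      {
        "Create": {
          "IndexName": "category-index",
          "KeySchema": [{"AttributeName": "category", "KeyType": "HASH"}],
          "Projection": {"ProjectionType": "ALL"},
          "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
        }
      }
    ]' \
    --profile admin
}

# Query the index for all items with the given category.
# Usage: query_by_category <category>   (value must not contain double quotes)
query_by_category() {
  aws dynamodb query \
    --table-name my_table \
    --index-name category-index \
    --key-condition-expression '#c = :c' \
    --expression-attribute-names '{"#c": "category"}' \
    --expression-attribute-values "{\":c\": {\"S\": \"$1\"}}" \
    --profile admin
}
```

Index creation is asynchronous as well, so a query may fail until the index status reported by describe-table becomes ACTIVE.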

You can read and learn more about LSIs and GSIs from the following links:

 https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LSI.html     

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html
