By Karthik Ranganathan, InfoWorld

The 9 most important new features in YugabyteDB

YugabyteDB 2.13 brings materialized views, local reads for performance, region-local backups, and much more, extending the geo-distribution capabilities of the database.

The rich deployment and replication options in YugabyteDB are must-haves for any developer or technologist building a modern, distributed cloud application. In geo-distributed environments, performance, developer friendliness, and compliance are key requirements that organizations must design into their data layer, rather than assuming they will magically exist simply because an application is deployed in the cloud.

YugabyteDB 2.13 is the latest release, delivering better control over where data is stored and accessed, an essential feature for modern applications in a geo-distributed environment. The native capabilities allow enterprises to lower data transfer costs, improve performance, and ensure compliance with regulatory requirements. The latest version of YugabyteDB extends the geo-distribution capabilities of the database, adding new features that enhance performance, increase control over backups, and intelligently utilize local data for reads.

Today, the YugabyteDB distributed SQL database helps thousands of developers accelerate cloud native agility, lower costs, and reduce risks without vendor lock-in. This enables them to focus on business growth and spend less time on complex data infrastructure management. Let’s take a look at nine key features in the latest YugabyteDB release.

Region-local data and backups in the cloud

YugabyteDB distributes and stores data within geographic regions to help organizations with data domiciling regulations such as GDPR in Europe. We expect additional legal jurisdictions to pass similar laws in the coming year. This means that modern database management systems must deliver simple, native functionality to assist organizations in meeting new and updated compliance requirements.

YugabyteDB 2.13 allows organizations to control where database backups are located by explicitly limiting them to specific geographic regions. Based on the data locality defined at table creation, each TServer writes files only to the backup destination that matches its configured region.

In addition to meeting these data domiciling requirements, keeping the data within cloud regions reduces cloud data transfer costs by avoiding cross-regional data copying.
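
The routing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the region names, bucket URLs, and the `backup_destination_for` helper are hypothetical and are not YugabyteDB APIs. The sketch shows one design choice that matters for compliance, which is that a region without a configured destination raises an error instead of silently falling back to a destination in another region.

```python
# Hypothetical region-to-destination mapping; the values are illustrative only.
REGION_BACKUP_DESTINATIONS = {
    "eu-west-1": "s3://backups-eu-west-1/yb",
    "us-east-1": "s3://backups-us-east-1/yb",
}


def backup_destination_for(tserver_region, destinations=REGION_BACKUP_DESTINATIONS):
    """Return the backup destination configured for a server's region.

    Raises ValueError when the region has no destination, so that data is
    never written to a bucket in a different region by accident.
    """
    try:
        return destinations[tserver_region]
    except KeyError:
        raise ValueError(
            f"no backup destination configured for region {tserver_region!r}"
        ) from None
```

With this mapping, a server in `eu-west-1` only ever resolves to the `eu-west-1` bucket, and a server in an unlisted region fails fast.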

Better performance for region-local transactions

A “transaction status” table tracks the status of transactions. Under the covers, this table is just another sharded table in the system. However, it does not use RocksDB, instead storing all of its data in memory, backed by the Raft WAL.

To achieve the A (atomicity) in ACID transactions, we make transaction status changes atomic along with the data operations. When the transaction status table is stored as a single global table, it can become a bottleneck for transactions on geo-partitioned data.

In YugabyteDB 2.13, the transaction status table is optimized for access from different regions. Because the transaction status table is now geo-partitioned as well, transactions no longer need a round trip to remote regions, which reduces query latency by keeping relevant metadata close to users.

YugabyteDB automatically creates a transaction status table using the user’s table placement information. However, you can also create a transaction status table manually. To do this, use the yb-admin create_transaction_table command, followed by modify_table_placement_info, to set the placement information for the newly created transaction status table.

Materialized views

A materialized view is a pre-computed data set derived from a query specification and stored for later use.

Because the data is pre-computed, querying a materialized view directly is faster than executing the underlying query against the base tables of the view. Materialized views can also significantly improve the performance of workloads dominated by common, repeated queries.

With YugabyteDB 2.13, materialized views recompute in the background when the base tables change, so incremental data changes from the base tables are automatically added to the materialized views. Queries against a materialized view still return fresh data. If changes to the base tables might invalidate the materialized view, the data is read directly from the base tables. If the changes do not invalidate the materialized view, only the changes are read from the base tables and the rest of the data is read from the materialized view.
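
The read path described above can be modeled with a small Python toy. It is a sketch of the idea only, not how YugabyteDB implements materialized views: the `MaterializedSum` class, its method names, and the rule that a delete invalidates the snapshot are all assumptions made for illustration.

```python
class MaterializedSum:
    """Toy materialized view: per-key totals over (key, amount) base rows."""

    def __init__(self, base_rows):
        self.base_rows = list(base_rows)
        self.snapshot = self._compute(self.base_rows)  # the pre-computed result
        self.pending = []         # rows inserted since the last refresh
        self.invalidated = False  # True when the snapshot can no longer be trusted

    @staticmethod
    def _compute(rows):
        totals = {}
        for key, amount in rows:
            totals[key] = totals.get(key, 0) + amount
        return totals

    def insert(self, key, amount):
        # An insert is an incremental change: the snapshot stays usable.
        self.base_rows.append((key, amount))
        self.pending.append((key, amount))

    def delete(self, key):
        # In this toy model a delete invalidates the snapshot.
        self.base_rows = [row for row in self.base_rows if row[0] != key]
        self.invalidated = True

    def read(self, key):
        if self.invalidated:
            # Fall back to the base rows so the answer is still fresh.
            return self._compute(self.base_rows).get(key, 0)
        # Otherwise combine the snapshot with only the pending changes.
        delta = sum(amount for k, amount in self.pending if k == key)
        return self.snapshot.get(key, 0) + delta

    def refresh(self):
        # Stands in for the background recomputation.
        self.snapshot = self._compute(self.base_rows)
        self.pending.clear()
        self.invalidated = False
```

Reads stay correct in all three states (fresh snapshot, snapshot plus pending inserts, invalidated snapshot); only the amount of work per read changes.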

TPC-C performance update

For those new to TPC-C, this is an OLTP benchmark used to measure performance when handling transactions generated by a real-world OLTP application. It models a business with multiple warehouses, each serving several districts, along with inventory for those warehouses, items, and orders for those items.

The number of warehouses is the key configurable parameter that determines the scale of the benchmark run. Increasing the number of warehouses increases the data set size, the number of concurrent clients, and the number of concurrently running transactions.

With YugabyteDB 2.13, the database can scale up to 1.27M tpmC with 150,000 warehouses, resulting in an efficiency score of 99.29%.

Change data capture (CDC)

Change data capture (CDC), introduced in YugabyteDB 2.13, allows multiple downstream applications and services to consume the continuous and never-ending stream of changes to Yugabyte databases. Streams scale to any YugabyteDB cluster, independent of its size. They also impact production traffic as little as possible.

Captured data changes include all row changes (inserts, updates, and deletes). CDC also covers metadata changes made through DDL, such as the creation, modification, or removal of tables, columns, and other database objects.

Each CDC event is completely self-describing. This means that an event’s key and value each contain a payload with the actual information, a schema that fully describes the structure of the information, and the origin cluster information.
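
To make "self-describing" concrete, here is a hedged Python sketch of what such an event could look like. The class names, field names, and type labels (`Part`, `ChangeEvent`, `source_cluster`, `"int"`, `"str"`) are hypothetical and do not reflect the actual YugabyteDB CDC wire format. The point is that a consumer can validate a payload using only the schema carried inside the event.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    schema: dict   # field name -> type label, e.g. {"id": "int"}
    payload: dict  # field name -> actual value


@dataclass(frozen=True)
class ChangeEvent:
    key: Part            # identifies the changed row
    value: Part          # the row contents after the change
    source_cluster: str  # origin cluster information
    op: str              # "insert", "update", or "delete"


# Hypothetical type labels mapped to Python types for validation.
TYPE_LABELS = {"int": int, "str": str}


def conforms(part):
    """Check a payload against the schema shipped in the same event."""
    if part.payload.keys() != part.schema.keys():
        return False
    return all(
        isinstance(part.payload[name], TYPE_LABELS[label])
        for name, label in part.schema.items()
    )
```

Because the schema travels with the payload, a downstream service needs no out-of-band table definition to interpret or validate an event.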

How does CDC provide consistency semantics? Here is how it’s implemented in YugabyteDB 2.13:

Per-tablet ordered delivery guarantee: All changes for rows in the same tablet are processed in the order in which they happened.
At-least-once delivery: In the case of failures that lead to message loss or that take too long to recover, messages are retransmitted to ensure at-least-once delivery.
No gaps in the change stream: Receiving any change for a row implies that all older changes for that row have already been received.
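
One practical consequence of at-least-once delivery is that a consumer may see the same change more than once, so applying changes should be idempotent. The following Python sketch shows one common way to do that by relying on per-tablet ordering. It assumes each change carries a tablet identifier and a per-tablet increasing sequence number; those fields and the `IdempotentConsumer` class are hypothetical and not part of any YugabyteDB client API.

```python
class IdempotentConsumer:
    """Applies change events once, even if some are redelivered."""

    def __init__(self):
        self.rows = {}      # materialized copy of the downstream table
        self.last_seq = {}  # tablet_id -> highest sequence number applied

    def handle(self, tablet_id, seq, op, key, value=None):
        """Apply one change; return False if it was a redelivered duplicate."""
        if seq <= self.last_seq.get(tablet_id, -1):
            # Per-tablet ordering means anything at or below the last applied
            # sequence number has already been processed.
            return False
        if op == "delete":
            self.rows.pop(key, None)
        else:  # "insert" and "update" both set the latest value
            self.rows[key] = value
        self.last_seq[tablet_id] = seq
        return True
```

For example, if a failure causes the first two changes of a tablet to be retransmitted, the consumer ignores the repeats and the downstream state is unaffected:

```python
consumer = IdempotentConsumer()
stream = [
    ("t1", 0, "insert", "k1", "a"),
    ("t1", 1, "update", "k1", "b"),
    ("t1", 0, "insert", "k1", "a"),  # redelivered
    ("t1", 1, "update", "k1", "b"),  # redelivered
    ("t2", 0, "insert", "k2", "x"),
    ("t1", 2, "delete", "k1", None),
]
applied = [consumer.handle(*change) for change in stream]
```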

Simplified application deployment

With YugabyteDB 2.13, developers have access to fully automated and integrated cloud-native development workflows. These workflows can be pre-configured with YugabyteDB using cloud-based development environments such as Gitpod and GitHub Codespaces. Both GitHub Codespaces and Gitpod workspaces can provision an instant development environment with a pre-configured YugabyteDB cluster.

New developer tools

YugabyteDB 2.13 provides support for MyBatis and Dapper ORM (object-relational mapping) tools. This allows developers to leverage new .NET and Java persistence frameworks to simplify building applications with YugabyteDB.

MyBatis: MyBatis is a first-class persistence framework with support for custom SQL, stored procedures, and advanced mappings. MyBatis eliminates almost all of the JDBC code and manual setting of parameters and retrieval of results.
Dapper: Dapper is an object–relational mapping product for the Microsoft .NET platform. It provides a framework for mapping an object-oriented domain model to a traditional relational database. Its purpose is to relieve the developer from a significant portion of relational data persistence-related programming tasks.

SOC 2 Type 1 compliance

Yugabyte has successfully completed a System and Organization Controls (SOC) 2 examination in accordance with the American Institute of Certified Public Accountants (AICPA) Trust Services Criteria for Security, Availability, Processing Integrity, Confidentiality, and Privacy.

Specifically, this accreditation confirms Yugabyte’s commitment to providing detailed information and assurance about security controls as they relate to our SaaS system.

Security partnerships

YugabyteDB 2.13 includes enhanced security and improved manageability capabilities built through Yugabyte’s partnerships. These new partnerships include:

HashiCorp Vault: Use industry-favorite HashiCorp Vault with YugabyteDB to enjoy a centralized, cloud-agnostic key management system (KMS) with secure access to secrets.
Imperva Cloud Data Protection: Utilize out-of-the-box support to simplify monitoring and tracking of data in YugabyteDB for audits and vulnerability detection.

YugabyteDB for geo-distributed workloads

Yugabyte’s mission is to deliver the most developer-friendly distributed SQL database. The release of YugabyteDB 2.13 now allows for simplified coding by offloading and automating key functions at the data layer. It improves the developer experience by delivering easy, interactive training and greater access to preferred developer tools.

Many of the world’s largest Fortune 500 companies, including Kroger and General Motors, use YugabyteDB for database modernization, cloud-native applications, and geo-distributed workloads. The improvements introduced in YugabyteDB 2.13 allow the database to deliver critical business outcomes faster, while more quickly reacting to external and internal changes. YugabyteDB helps organizations become truly data driven by removing the tradeoffs found in legacy databases. This means organizations can instead prioritize innovation and improved customer experience.

Karthik Ranganathan is the co-founder and CTO at Yugabyte, the company behind YugabyteDB, a transactional distributed SQL database for cloud-native applications. Ranganathan received his BS and MS in CS from IIT-M and UT Austin. Ranganathan was one of the original database engineers at Facebook responsible for building distributed databases such as Cassandra and HBase. He is an Apache HBase committer and was an early contributor to Cassandra, before it was open-sourced by Facebook.

—

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.
