Tencent builds one NoSQL database to rule all data models

Tamed DB sprawl and saved cloudy resources with 'X-Stor'


Exclusive: Chinese web giant Tencent has revealed it created a NoSQL database that it believes can handle multiple data models more elegantly than other attempts to do so, and has used it to consolidate its database fleet and improve resource utilization.

The existence of the database – named X-Stor – was recently revealed in a paper [PDF] published in the Proceedings of the Very Large Data Base Endowment, the journal of the non-profit organization that exists to promote and exchange scholarly work on databases and related fields.

The paper opens with observations that NoSQL databases are generally built to handle certain data models. Tencent admits it ran several of them to power its fleet of products – social networks, video streaming services, online games, and a public cloud – that collectively serve more than a billion active users.

Titled "X-Stor: A Cloud-native NoSQL Database Service with Multi-model Support", the paper reveals Tencent used graph databases to store info about user relationships for its social networks, wide-column stores to hold user profiles, document databases to power its advertising operations, and time-series databases to record user behavior data.

That proved less than ideal because Tencent found it hard to support novel data models in existing systems – so it sometimes needed to develop a new NoSQL system from scratch. Doing so meant rebuilding functions already found elsewhere – a wasteful overlap.

Like any hyperscaler, Tencent abhors under-used resources. The web giant was therefore not thrilled to learn that "deploying multiple heterogeneous databases at scale leads to system resources isolation for different NoSQL databases, which not only complicates maintenance but also hinders efficient resource sharing among clusters."

X-Stor addresses that issue – allowing the use of different data models by "extending the corresponding storage engine and data access interfaces within the X-Stor system." The independent storage engines "can fully support their respective data models, with performance comparable to that of their single-model counterparts."
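For readers who think in code, the general idea of plugging per-model storage engines into one service can be sketched as follows. This is a minimal illustration only: the class and method names (`StorageEngine`, `MultiModelService`, `register_engine` and friends) are hypothetical, and nothing here is taken from X-Stor's actual interfaces, which the paper does not expose as code.

```python
import bisect
from abc import ABC, abstractmethod


class StorageEngine(ABC):
    """Hypothetical per-model engine interface (an assumption, not X-Stor's API)."""

    @abstractmethod
    def put(self, key, value):
        ...

    @abstractmethod
    def get(self, key):
        ...


class KeyValueEngine(StorageEngine):
    """Plain key-value model: one value per key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class TimeSeriesEngine(StorageEngine):
    """Time-series model: (timestamp, reading) points kept sorted per series."""

    def __init__(self):
        self._series = {}

    def put(self, key, value):
        # value is a (timestamp, reading) tuple; insort keeps the list ordered.
        bisect.insort(self._series.setdefault(key, []), value)

    def get(self, key):
        return list(self._series.get(key, []))


class MultiModelService:
    """One front door; each table is backed by the engine for its data model."""

    def __init__(self):
        self._engines = {}  # model name -> engine factory
        self._tables = {}   # table name -> engine instance

    def register_engine(self, model, engine_factory):
        # Supporting a new data model means registering one more engine.
        self._engines[model] = engine_factory

    def create_table(self, name, model):
        if model not in self._engines:
            raise ValueError(f"no engine registered for model {model!r}")
        self._tables[name] = self._engines[model]()

    def table(self, name):
        return self._tables[name]
```

In this toy version, adding a data model is a matter of writing one more `StorageEngine` subclass and registering it, while the shared service around it stays untouched. That is the overlap-avoiding property the paper describes, shown here at desk-toy scale.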

The paper claims that's a more elegant arrangement than those used by rival NoSQL databases MongoDB, Redis, and ArangoDB, each of which has its own way of accommodating multiple data models.

X-Stor is serverless and runs as multiple microservices orchestrated by Tencent's own Kubernetes Engine. Tencent initially ran the database on hosts packed with fast SSDs to handle the needs of different data models, such as I/O-intensive key-value and time-series models. However, that left memory under-utilized on some SSD-equipped servers. X-Stor can identify which nodes have the resources needed to match a workload and the data model it employs, so that each node is used as fully as possible.
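To give a feel for what matching workloads to nodes involves, here is a deliberately simple best-fit sketch. It is not the allocation scheme from Tencent's paper: the `Node` and `Workload` types, the three resource dimensions, and the leftover-based scoring are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu: float
    free_mem_gb: float
    free_iops: float


@dataclass
class Workload:
    name: str
    cpu: float
    mem_gb: float
    iops: float


def fits(node, w):
    """True if the node has enough of every resource for the workload."""
    return (node.free_cpu >= w.cpu
            and node.free_mem_gb >= w.mem_gb
            and node.free_iops >= w.iops)


def _leftover_fraction(free, demand):
    # Fraction of the currently free resource that would stay unused.
    return (free - demand) / free if free > 0 else 0.0


def leftover(node, w):
    """Sum of unused fractions across resources; lower means a tighter fit."""
    return (_leftover_fraction(node.free_cpu, w.cpu)
            + _leftover_fraction(node.free_mem_gb, w.mem_gb)
            + _leftover_fraction(node.free_iops, w.iops))


def place(nodes, w):
    """Best-fit placement: pick the feasible node with the least leftover."""
    candidates = [n for n in nodes if fits(n, w)]
    if not candidates:
        return None
    best = min(candidates, key=lambda n: leftover(n, w))
    # Reserve the resources on the chosen node.
    best.free_cpu -= w.cpu
    best.free_mem_gb -= w.mem_gb
    best.free_iops -= w.iops
    return best
```

With one memory-rich node and one SSD-rich node, an I/O-hungry key-value workload lands on the latter and a memory-hungry one on the former, which is the kind of complementary packing the paper is after. The real system has to cope with contention and changing load, which this sketch ignores.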

Tencent's paper offers some dense math explaining how workloads compete for and are allocated resources – enjoy its equations if that's your thing.

The bottom line is that the Chinese giant built itself a database it claims can handle any data model – even entirely new ones – and which it says has scaled to store 12PB of online operational data and serve 700 billion requests per day, with a peak of 30 million requests per second, across more than 100,000 tables that use multiple data models.
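Those request figures are easier to compare once they share a unit. A quick back-of-the-envelope calculation, which assumes only that the daily total is spread over a 24-hour day, looks like this:

```python
# Figures quoted from the paper's headline numbers.
REQUESTS_PER_DAY = 700_000_000_000   # 700 billion
PEAK_PER_SECOND = 30_000_000         # 30 million
SECONDS_PER_DAY = 24 * 60 * 60       # 86,400

# Average rate if the daily load were perfectly even.
average_per_second = REQUESTS_PER_DAY / SECONDS_PER_DAY

# How much higher the quoted peak is than that average.
peak_to_average = PEAK_PER_SECOND / average_per_second

print(f"average: {average_per_second:,.0f} requests/s")
print(f"peak is {peak_to_average:.1f}x the average")
```

That works out to roughly 8.1 million requests per second on average, with the quoted peak about 3.7 times higher.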

Sadly, it appears the database is not open source – so the rest of us can't take it for a spin.

China's hyperscalers are doing interesting things. We've recently reported on Alibaba Cloud's hardware failure detection code, modular datacenter architecture, and an advanced Ethernet scheme that sees nine NICs installed in the servers it uses for AI model training. Huawei Cloud runs an advanced network health probe. Tencent found a way to halve WAN latency. ®
