In the first half, we introduce three modern serialization systems: Protocol Buffers, Apache Thrift, and Apache Avro. Which one meets your needs?
In the second half, we show an example architecture for a data ingestion system built on Apache Avro.
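One detail these formats have in common can be sketched in a few lines. Avro's binary encoding writes int and long values as variable-length zig-zag integers, and Protocol Buffers uses the same idea for its sint32 and sint64 types. The sketch below is a standalone illustration under that assumption; the function names are hypothetical and it does not use any of the three libraries.

```python
def zigzag_encode(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (n << 1) ^ (n >> 63)


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


def write_varint(z: int) -> bytes:
    """Emit 7 bits per byte, least significant group first; the high bit marks 'more bytes follow'."""
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def read_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


def encode_long(n: int) -> bytes:
    """Zig-zag, then varint: small magnitudes of either sign take one byte."""
    return write_varint(zigzag_encode(n))


def decode_long(data: bytes) -> int:
    return zigzag_decode(read_varint(data))
```

With this scheme, values such as -1 or 1 occupy a single byte, while 64 needs two, which is why small counters and deltas serialize compactly.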
Spilo is a tool that provides high availability for PostgreSQL databases running on AWS. It uses Patroni and etcd to handle replication, failover, and cluster state management. Teams at Zalando use Spilo to run over 150 PostgreSQL databases in a self-managed way on AWS, with each team responsible for its own databases. Spilo automates deploying, replicating, and failing over PostgreSQL clusters on AWS, which allows more agility than managed database services.
This document provides an overview and deep dive into Robinhood's RDS Data Lake architecture for ingesting data from their RDS databases into an S3 data lake. It discusses their prior daily snapshotting approach, and how they implemented a faster change data capture pipeline using Debezium to capture database changes and ingest them incrementally into a Hudi data lake. It also covers lessons learned around change data capture setup and configuration, initial table bootstrapping, data serialization formats, and scaling the ingestion process. Future work areas discussed include orchestrating thousands of pipelines and improving downstream query performance.
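The difference between daily snapshots and incremental change data capture can be illustrated with a small sketch. It assumes a simplified Debezium-style event with an `op` code (`c` create, `u` update, `d` delete, `r` snapshot read) and `before`/`after` row images; the `apply_changes` helper, the `id` key, and the in-memory dict standing in for the lake table are all hypothetical, and this is not Robinhood's pipeline or Hudi's API.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_changes(table: Dict[Any, Row], events: Iterable[dict], key: str = "id") -> Dict[Any, Row]:
    """Apply change events in order to a table keyed by primary key (an upsert/delete merge)."""
    for event in events:
        op = event["op"]
        if op in ("c", "r", "u"):
            # Creates, snapshot reads, and updates all carry the new row image in "after".
            row = event["after"]
            table[row[key]] = row
        elif op == "d":
            # Deletes carry only the old row image in "before".
            table.pop(event["before"][key], None)
        else:
            raise ValueError(f"unsupported op: {op!r}")
    return table


# Bootstrap from an initial snapshot, then apply only what changed since.
snapshot = {1: {"id": 1, "status": "open"}, 2: {"id": 2, "status": "open"}}
changes = [
    {"op": "u", "before": {"id": 1, "status": "open"}, "after": {"id": 1, "status": "filled"}},
    {"op": "d", "before": {"id": 2, "status": "open"}, "after": None},
    {"op": "c", "before": None, "after": {"id": 3, "status": "open"}},
]
current = apply_changes(dict(snapshot), changes)
```

Only three events are processed here instead of re-reading the whole table, which is the property that makes incremental ingestion faster than a full daily snapshot.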
Making Cassandra more capable, faster, and more reliable (at ApacheCon@Home 2... Scalar, Inc.
The document presents work on making Cassandra more efficient and reliable through Scalar DB, which adds ACID transaction support without modifying Cassandra's core. It discusses new methods for commit log synchronization that improve durability and performance, in particular a group commit log sync mode. The performance and scalability of Scalar DB on Cassandra are benchmarked, and its transaction management and reliability are verified through extensive testing.
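The idea behind group commit can be sketched as follows: several writes share a single fsync, and a write is acknowledged only once the sync covering it has completed. This is a minimal illustration, not Cassandra's implementation; the `GroupCommitLog` class and `demo` function are hypothetical, and the group is closed by record count for determinism, whereas a real system would typically close it on a time window.

```python
import os
import tempfile
from typing import List


class GroupCommitLog:
    """Append-only log that makes records durable in groups (one fsync per group)."""

    def __init__(self, path: str, group_size: int) -> None:
        self._file = open(path, "ab")
        self._group_size = group_size
        self._pending: List[bytes] = []
        self.sync_count = 0

    def append(self, record: bytes) -> List[bytes]:
        """Buffer a record; return the records that became durable as a result (possibly none)."""
        self._file.write(record + b"\n")
        self._pending.append(record)
        if len(self._pending) >= self._group_size:
            return self.sync()
        return []

    def sync(self) -> List[bytes]:
        """Flush and fsync once for all pending records, then acknowledge them together."""
        if not self._pending:
            return []
        self._file.flush()
        os.fsync(self._file.fileno())
        self.sync_count += 1
        durable, self._pending = self._pending, []
        return durable

    def close(self) -> List[bytes]:
        durable = self.sync()
        self._file.close()
        return durable


def demo(n_records: int = 10, group_size: int = 4):
    """Write n_records and report (number of fsyncs, number of acknowledged records)."""
    with tempfile.TemporaryDirectory() as tmp:
        log = GroupCommitLog(os.path.join(tmp, "commit.log"), group_size)
        acked: List[bytes] = []
        for i in range(n_records):
            acked.extend(log.append(f"tx-{i}".encode()))
        acked.extend(log.close())
        return log.sync_count, len(acked)
```

Syncing on every write would cost one fsync per record; grouping trades a little latency for far fewer syncs while still acknowledging only durable writes.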
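Materials from the 29th study session: "A Super-Introduction to PostgreSQL Recovery"
See also http://www.interdb.jp/pgsql (Coming soon!)
For beginners. Explains how PostgreSQL's WAL, CHECKPOINT, and online backup work.
After this one, see next → http://www.slideshare.net/satock/29shikumi-backup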
The document provides an overview of Scalar DL, a tamper-evident and scalable database system. Some key points:
1. Scalar DL allows detection of Byzantine faults as long as the number of administrative domains exceeds the number of faulty domains. It can linearly scale performance and availability.
2. The system is database-agnostic and cloud-agnostic, supporting databases like Cassandra and clouds like AWS.
3. The architecture uses ledgers, ordering, and auditors. Ordering extracts parallelism while maintaining determinism. Auditors verify proofs of execution without trusting the ledger.
4. Benchmark results show Scalar DL outperforms Hyperledger Fabric for the Smallbank workload on Amazon