Posts tagged HBase
HBase to CDP Operational Database Migration Overview

Feed: Cloudera Blog. Author: Liliana Kadar. Posted in Technical | February 04, 2022 | 2 min read. This blog post provides an overview of the HBase to CDP Operational Database (COD) migration process. CDP Operational Database enables developers to quickly build future-proof applications that are architected to handle data evolution. It helps developers automate and simplify database management with capabilities such as auto-scale, and it is fully integrated with Cloudera Data Platform (CDP). For more information and to get started with COD, refer to Getting Started with Cloudera Data Platform Operational Database (COD). You can now migrate your existing HBase workloads to COD* with ... Read More
Stream Apache HBase edits for real-time analytics

Feed: AWS Big Data Blog. Apache HBase is a non-relational database. To consume the data, applications normally have to query the database and pull data and changes from its tables. In this post, we introduce a mechanism to stream Apache HBase edits into streaming services such as Apache Kafka or Amazon Kinesis Data Streams. In this approach, changes to data are pushed and queued into a streaming platform such as Kafka or Kinesis Data Streams for real-time processing, using a custom Apache HBase replication endpoint. We start with a brief technical background on HBase replication and review a use case in ... Read More
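The pipeline described above hinges on HBase's pluggable replication endpoint interface. As a rough, hypothetical sketch of that pattern (not the post's actual code), a custom endpoint extending BaseReplicationEndpoint in HBase 2.x can forward every replicated WAL cell to a Kafka topic; the class name, topic, and broker address below are assumptions:

```java
// Hypothetical sketch of a custom replication endpoint that forwards WAL edits to Kafka.
// Topic name, broker address, and class name are illustrative, not from the referenced post.
import java.util.Properties;
import java.util.UUID;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.replication.BaseReplicationEndpoint;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.wal.WAL;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaReplicationEndpoint extends BaseReplicationEndpoint {

  private static final String TOPIC = "hbase-edits"; // assumed topic name
  private KafkaProducer<String, String> producer;    // pushes edits to the stream

  @Override
  protected void doStart() {
    Properties props = new Properties();
    props.put("bootstrap.servers", "broker:9092");   // assumed broker address
    props.put("key.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
    producer = new KafkaProducer<>(props);
    notifyStarted();
  }

  @Override
  protected void doStop() {
    if (producer != null) {
      producer.close();
    }
    notifyStopped();
  }

  @Override
  public UUID getPeerUUID() {
    // A stable, synthetic UUID identifying this "peer" (there is no target HBase cluster).
    return UUID.nameUUIDFromBytes(Bytes.toBytes("kafka-replication-endpoint"));
  }

  @Override
  public boolean replicate(ReplicateContext context) {
    // Flatten each batch of WAL entries into one Kafka record per cell.
    for (WAL.Entry entry : context.getEntries()) {
      String table = entry.getKey().getTableName().getNameAsString();
      for (Cell cell : entry.getEdit().getCells()) {
        String key = table + ":" + Bytes.toString(CellUtil.cloneRow(cell));
        String value = Bytes.toString(CellUtil.cloneFamily(cell)) + ":"
            + Bytes.toString(CellUtil.cloneQualifier(cell)) + "="
            + Bytes.toString(CellUtil.cloneValue(cell));
        producer.send(new ProducerRecord<>(TOPIC, key, value));
      }
    }
    producer.flush();
    return true; // tells HBase the batch was shipped so it can advance the replication offset
  }
}
```

An endpoint like this would typically be packaged onto the cluster and registered with the HBase shell's add_peer command via the ENDPOINT_CLASSNAME option, so that region servers ship WAL entries to it instead of to a peer HBase cluster.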
Delivering High Performance for Cloudera Data Platform Operational Database (HBase) When Using S3

Feed: Cloudera Blog. Author: Ankit Singhal. Posted in Technical | December 08, 2021 | 7 min read. CDP Operational Database (COD) is a real-time, auto-scaling operational database powered by Apache HBase and Apache Phoenix. It is one of the main Data Services that runs on Cloudera Data Platform (CDP) Public Cloud. You can access COD right from your CDP console. With COD, application developers can now leverage the power of HBase and Phoenix without the overhead related to deployment and management. COD is easy to provision and autonomous, which means developers can provision a new database instance within minutes and start creating ... Read More
Value Proposition of the Cloudera Operational Database over Legacy Apache HBase Deployments

Feed: Cloudera Blog. Author: Andreas Skouloudis. Posted in Business | September 09, 2021 | 12 min read. The CDP Operational Database (COD) builds on the foundation of the existing operational database capabilities that were available with Apache HBase and/or Apache Phoenix in legacy CDH and HDP deployments. Within the context of a broader data and analytics platform implemented in the Cloudera Data Platform (CDP), COD will function as a highly scalable relational and non-relational transactional database, allowing users to leverage big data in operational applications, as well as serve as the backbone of the analytical ecosystem, being leveraged by other CDP experiences (e.g., Cloudera Machine ... Read More
Cloudera’s HBase PaaS offering now supports Complex Transactions

Feed: DB-Engines.com Blog. Author: Krishna Maheshwari. Posted 11 August 2021. Tags: Apache Phoenix, DBaaS, HBase. Cloudera announced the general availability of the CDP Operational Database Experience (COD) on both AWS & Azure in February and recently announced support for full ACID capabilities within the relational mode of the Operational Database. CDP Operational Database is a fully managed, cloud-native transactional database with unparalleled scale, performance, and reliability. Optimized to be deployed anywhere, on any cloud platform, CDP Operational Database aligns with the cloud infrastructure strategy best suited for the business. It enables application developers to deliver prototypes in under an ... Read More
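In the relational mode (Apache Phoenix), ACID transactions surface through the standard JDBC workflow. Below is a minimal sketch, assuming a Phoenix table created with TRANSACTIONAL=true and transactions enabled on the cluster; the connection URL, table, and values are placeholders, not taken from the announcement:

```java
// Minimal sketch (not from the announcement): grouping two Phoenix UPSERTs into one
// ACID transaction over JDBC. URL, table, and column names are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class PhoenixTxnExample {
  public static void main(String[] args) throws Exception {
    // Assumes an 'accounts' table created with TRANSACTIONAL=true and
    // phoenix.transactions.enabled set on the cluster.
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-quorum:2181")) {
      conn.setAutoCommit(false);                      // start an explicit transaction
      try (Statement stmt = conn.createStatement()) {
        stmt.executeUpdate("UPSERT INTO accounts(id, balance) VALUES (1, 900)");
        stmt.executeUpdate("UPSERT INTO accounts(id, balance) VALUES (2, 1100)");
      }
      conn.commit();                                  // both rows become visible atomically
    }
  }
}
```

With auto-commit disabled, the two UPSERTs commit or roll back as a unit, illustrating the kind of multi-statement atomicity that full ACID support enables.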
Real-Time Big Data Analytics: How to Replicate from MySQL to Hadoop

Feed: Planet MySQL. Author: Continuent. First off: Happy 15th birthday, Hadoop! It wasn’t an April Fool’s joke then, and it isn’t today either: Hadoop’s initial release was on the 1st of April 2006 :-) As most of you will know, Apache Hadoop is a powerful and popular tool, which has been driving much of the Big Data movement over the years. It is generally understood to be a system that provides a (distributed) file system, which in turn stores data to be used by applications without knowing about the structure of the data. In other words, it’s a file system ... Read More
Emil Shkolnik: Is Greenplum Database “just a big sharded PostgreSQL”?

Feed: Planet PostgreSQL. Posted 29 Mar. What is Greenplum Database? It is one of the PostgreSQL forks, optimized for OLAP and analytics workloads. In my opinion, the second life of Greenplum DB began in 2015, the year Greenplum became an open source project. The current version 6 is based on PostgreSQL 9.4, and the Greenplum community is actively developing version 7, which should be compatible with PostgreSQL 13! So, this is really cool! But what prompted me to write this article? The fact is that sometimes we are confronted with the opinion ... Read More
Amazon EMR 6.2.0 adds persistent HFile tracking to improve performance with HBase on Amazon S3

Feed: AWS Big Data Blog. Apache HBase is an open-source NoSQL database that you can use to achieve low-latency random access to billions of rows. Starting with Amazon EMR 5.2.0, you can enable HBase on Amazon Simple Storage Service (Amazon S3). With HBase on Amazon S3, the HBase data files (HFiles) are written to Amazon S3, enabling data lake architecture benefits such as the ability to scale storage and compute requirements separately. Amazon S3 also provides higher durability and availability than the default HDFS storage layer. When using Amazon EMR 5.7.0 or later, you can set up a read ... Read More
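For context, enabling HBase on Amazon S3 is a cluster-creation setting rather than a code change: the hbase classification switches the storage mode, and hbase-site points the HBase root directory at a bucket. The sketch below shows one hypothetical way to pass those classifications through the AWS SDK for Java v2; the bucket, IAM roles, release label, and instance settings are placeholders, and the same configuration can be supplied from the console or CLI.

```java
// Illustrative sketch only: launching an EMR cluster with HBase storing HFiles on S3.
// Bucket, roles, release label, and instance settings below are placeholders.
import java.util.Map;

import software.amazon.awssdk.services.emr.EmrClient;
import software.amazon.awssdk.services.emr.model.Application;
import software.amazon.awssdk.services.emr.model.Configuration;
import software.amazon.awssdk.services.emr.model.JobFlowInstancesConfig;
import software.amazon.awssdk.services.emr.model.RunJobFlowRequest;
import software.amazon.awssdk.services.emr.model.RunJobFlowResponse;

public class HBaseOnS3Cluster {
  public static void main(String[] args) {
    // "hbase" switches the storage mode; "hbase-site" points the root dir at S3.
    Configuration hbase = Configuration.builder()
        .classification("hbase")
        .properties(Map.of("hbase.emr.storageMode", "s3"))
        .build();
    Configuration hbaseSite = Configuration.builder()
        .classification("hbase-site")
        .properties(Map.of("hbase.rootdir", "s3://my-bucket/hbase-root"))
        .build();

    try (EmrClient emr = EmrClient.create()) {
      RunJobFlowRequest request = RunJobFlowRequest.builder()
          .name("hbase-on-s3")
          .releaseLabel("emr-6.2.0")
          .applications(Application.builder().name("HBase").build())
          .configurations(hbase, hbaseSite)
          .serviceRole("EMR_DefaultRole")
          .jobFlowRole("EMR_EC2_DefaultRole")
          .instances(JobFlowInstancesConfig.builder()
              .instanceCount(3)
              .masterInstanceType("m5.xlarge")
              .slaveInstanceType("m5.xlarge")
              .keepJobFlowAliveWhenNoSteps(true)
              .build())
          .build();
      RunJobFlowResponse response = emr.runJobFlow(request);
      System.out.println("Cluster id: " + response.jobFlowId());
    }
  }
}
```

The persistent HFile tracking described in the post arrives with Amazon EMR 6.2.0 on top of this same storage mode.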
Amazon EMR 2020 year in review

Feed: AWS Big Data Blog. Tens of thousands of customers use Amazon EMR to run big data analytics applications on Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto at scale. Amazon EMR automates the provisioning and scaling of these frameworks, and delivers high performance at low cost with optimized runtimes and support for a wide range of Amazon Elastic Compute Cloud (Amazon EC2) instance types and Amazon Elastic Kubernetes Service (Amazon EKS) clusters. Amazon EMR makes it easy for data engineers and data scientists to develop, visualize, and debug data science applications with Amazon EMR Studio ... Read More
Introducing Amazon EMR integration with Apache Ranger

Feed: AWS Big Data Blog. Data security is an important pillar in data governance. It includes authentication, authorization, encryption, and audit. Amazon EMR enables you to set up and run clusters of Amazon Elastic Compute Cloud (Amazon EC2) instances with open-source big data applications like Apache Spark, Apache Hive, Apache Flink, and Presto. You may also want to set up multi-tenant EMR clusters where different users (or teams) can use a shared EMR cluster to run big data analytics workloads. In a multi-tenant cluster, it becomes important to set up mechanisms for authentication (determine who is invoking the application and authenticate the ... Read More