Posts tagged Apache
Amazon Elasticsearch Service now supports rollups, reducing storage costs for extended retention
Feed: Recent Announcements. Amazon Elasticsearch Service introduces index rollups that let you summarize high-granularity data and preserve feature-rich aggregations over large data sets for analytics while reducing storage costs. As time-series data grows over time, it can slow down your aggregations and incur a substantial storage cost. Also, the usefulness of granular data diminishes with time. Rollups let you create a new index containing only relevant fields aggregated into coarser time buckets. With rolled-up indexes, users can store up to years of data with reduced storage costs. Users can initiate rollups on ... Read More
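The idea of aggregating selected fields into coarser time buckets can be illustrated with a minimal Python sketch. This is not the service's rollup API; the function name `rollup_hourly` and the document fields `timestamp`, `host`, and `latency_ms` are hypothetical, chosen only to show how many raw documents collapse into one summary document per hour and host.

```python
from collections import defaultdict
from datetime import datetime


def rollup_hourly(docs):
    """Summarize raw documents into one record per (hour, host).

    Assumes each document has an ISO 8601 'timestamp' without a
    timezone suffix, a 'host' string, and a numeric 'latency_ms'.
    These field names are illustrative, not part of any real schema.
    """
    buckets = defaultdict(
        lambda: {"count": 0, "sum": 0.0, "min": float("inf"), "max": float("-inf")}
    )
    for doc in docs:
        ts = datetime.fromisoformat(doc["timestamp"])
        # Truncate the timestamp to the start of its hour (the coarser bucket).
        hour = ts.replace(minute=0, second=0, microsecond=0).isoformat()
        stats = buckets[(hour, doc["host"])]
        value = doc["latency_ms"]
        stats["count"] += 1
        stats["sum"] += value
        stats["min"] = min(stats["min"], value)
        stats["max"] = max(stats["max"], value)

    # Emit one summary document per bucket, ordered by hour and host.
    return [
        {"hour": hour, "host": host, **stats}
        for (hour, host), stats in sorted(buckets.items())
    ]
```

Storing count, sum, min, and max per bucket keeps common aggregations answerable from the summary alone (the average is `sum / count`), which is why a rolled-up index can be much smaller than the raw data while still supporting analytics.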
Amazon Elasticsearch Service adds support for Reporting in Kibana
Feed: Recent Announcements. Amazon Elasticsearch Service now supports Reporting, a new feature that enables Kibana users to generate and download reports. They can now generate reports directly from the Dashboard, Visualize, and Discover panels, and export them to PDF, CSV, and PNG file formats. Reporting is powered by Open Distro for Elasticsearch, an Apache 2.0-licensed distribution of Elasticsearch, and is supported on all Amazon Elasticsearch Service domains running Elasticsearch 7.9 or greater. To learn more about Open Distro for Elasticsearch and Reporting, visit the project website and documentation. Reporting for Amazon Elasticsearch Service is available across 24 regions globally: US ... Read More
Retaining data streams up to one year with Amazon Kinesis Data Streams
Feed: AWS Big Data Blog. Streaming data is used extensively for use cases like sharing data between applications, streaming ETL (extract, transform, and load), real-time analytics, processing data from internet of things (IoT) devices, application monitoring, fraud detection, live leaderboards, and more. Typically, data streams are stored for short durations before being loaded into a permanent data store like a data lake or analytics service. Additional use cases are becoming more prevalent that may require you to retain data in streams for longer periods of time. For example, compliance programs like HIPAA and FedRAMP may require you to store ... Read More
How the evolution of data analytics impacts the digital marketing industry
Feed: Big Data Made Simple. Author: Philip Piletic. The modern digital marketing industry simply couldn’t exist without the aggregation of huge amounts of data. That being said, the role that data plays in marketing has changed dramatically in the last few years. Some computer scientists are suggesting that many organizations that currently collect customer information will soon be unable to process the sheer amount of data they’re working with. Unfortunately, that means some companies have been reduced to guessing as opposed to actually using their data wisely. This, combined with the recent announcement that major marketing firms are ... Read More
Accelerating Department of Defense mission workloads with Azure
Feed: Microsoft Azure Blog. Author: Eric Brown. As the Azure engineering team continues to deliver a rapid pace of innovation for defense customers, we’re also continuing to support Department of Defense (DoD) customers and partners in delivering new capabilities to serve mission needs. In many cases, accelerating mission workloads means forging a faster and more secure way for teams to build, ship, and authorize new applications. For the broad range of suppliers providing goods and services to the DoD, including the Defense Industrial Base (DIB), this also means navigating evolving compliance requirements. Navigating the new Cybersecurity Maturity Model Certification (CMMC) ... Read More
Kustomize Best Practices
Feed: R-bloggers. Author: Open Analytics. [This article was first published on Open Analytics, and kindly contributed to R-bloggers]. Introduction: In recent years, Kubernetes has become a renowned solution for orchestrating cloud-independent infrastructure. Open Analytics supports the data analysis process end to end. This includes infrastructure that underpins the data science platforms we build. Since we exclusively work with open technology, it should come as no surprise that we adopted Kubernetes ... Read More
Amazon MSK backup for Archival, Replay, or Analytics
Feed: AWS Architecture Blog. Amazon MSK is a fully managed service that helps you build and run applications that use Apache Kafka to process streaming data. Apache Kafka is an open-source platform for building real-time streaming data pipelines and applications. With Amazon MSK, you can use native Apache Kafka APIs to populate data lakes. You can also stream changes to and from databases, and power machine learning and analytics applications. Amazon MSK simplifies the setup, scaling, and management of clusters running Apache Kafka. MSK manages the provisioning, configuration, and maintenance of resources for highly available Kafka clusters. It is fully ... Read More
Building an administrative console in Amazon QuickSight to analyze usage metrics
Feed: AWS Big Data Blog. Given the scalability of Amazon QuickSight to hundreds and thousands of users, a common use case is to monitor QuickSight group and user activities, analyze the utilization of dashboards, and identify usage patterns of an individual user and dashboard. With timely access to interactive usage metrics, business intelligence (BI) administrators and data team leads can efficiently plan for stakeholder engagement and dashboard improvements. For example, you can remove inactive authors to reduce license cost, as well as analyze dashboard popularity to understand user acceptance and stickiness. This post demonstrates how to build an administrative console ... Read More
First mlverse survey results – software, applications, and beyond
Feed: R-bloggers. Author: Sigrid Keydana. [This article was first published on RStudio AI Blog, and kindly contributed to R-bloggers]. Thank you everyone who participated in our first mlverse survey! Wait: What even is the mlverse? The mlverse originated as an abbreviation of multiverse, which, on its part, came into being as an intended allusion to the well-known tidyverse. As such, although mlverse software aims for seamless interoperability with the tidyverse, ... Read More
Apache Spark Connector for SQL Server and Azure SQL now compatible with Spark 3.0
Feed: Microsoft Azure Blog. Author: Rahul Ajmera. Accelerate big data analytics with the Spark 3.0 compatible connector for SQL Server, now in preview. We are announcing the preview release of the Apache Spark 3.0 compatible Apache Spark Connector for SQL Server and Azure SQL, available through Maven. Open sourced in June 2020, the Apache Spark Connector for SQL Server is a high-performance connector that enables you to use transactional data in big data analytics and persist results for ad-hoc queries or reporting. It allows you to use SQL Server or Azure SQL as input data sources or output data sinks ... Read More