Posts tagged streaming
Amazon EMR: A utility that comes with Hadoop that enables you to develop MapReduce executables in languages other than Java. Amazon CloudFront: The ability to use a media file in real time, as it is transmitted in a steady stream from a server.
Simon Riggs: Thoughts on Uber’s List of Postgres Limitations
An Uber technical blog of July 2016 described the perception of “many Postgres limitations”. Regrettably, a number of important technical points are either not correct or not wholly correct because they overlook many optimizations in PostgreSQL that were added specifically to address the cases discussed. In most cases, those limitations were actually true in the distant past of 5-10 years ago, so that leaves us with the impression of comparing MySQL as it is now with PostgreSQL as it was a decade ago. This is no doubt because the post was actually ... Read More
Streaming Analytics: A story of many tales

Streaming Analytics is a data processing paradigm that has been gaining traction lately, mainly because more and more data is available as events through web services and real-time sources rather than being collected and packaged in data batches. Over the past years, we have seen a number of projects tackle this problem. Some stem from batch analysis and achieve streaming computation by scheduling data processing at a high rate (micro-batching). Other projects started with event processing and moved up to processing bigger data collections. Streaming analytics can be applied to relational, NoSQL, and document-oriented data stores. The underlying ... Read More
Building a scalable platform for streaming updates and analytics

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. In this episode of the O’Reilly Data Show, I sit down with Evan Chan, distinguished engineer at Tuplejump. We talk about the early days of Spark (particularly his contributions to Spark/Cassandra integration), his interesting new open source project (FiloDB), and recent trends in cloud computing. Bringing Apache Spark & Apache Cassandra together: Datastax credits me with inspiring them to bring Spark into Cassandra … I think they’re very generous about that. I think I was one of the first folks to ... Read More
Building systems for massive scale data applications

Many of the open source systems and projects we’ve come to love, including Hadoop and HBase, were inspired by systems used internally within Google. These systems were described in papers and implemented by people who needed frameworks that could comfortably scale to massive data sets. Google engineers and scientists continue to publish interesting papers, and these days some of the big data systems they describe in publications are available on their cloud platform. In this episode of the O’Reilly ... Read More
Operationalize your machine learning project using SQL Server 2016 SSIS and R Services

With the release of CTP3 of SQL Server 2016 and its native in-database support for the open source R language (SQL Server R Services), users can now call both R and RevoScaleR functions and scripts directly from within a SQL query and benefit from multi-threaded and multi-core in-database computations. The R integration brings the utility of data science to your applications without the need to ‘export’ the data to your R environment. Moreover, users can now use SQL Server Integration Services (SSIS) to: – extract data from various on-premises and/or cloud sources to build training data – extract data from various ... Read More
Writing SQL on Streaming Data with Amazon Kinesis Analytics – Part 1

Feed: AWS Big Data Blog.
Ryan Nienhuis is a Senior Product Manager for Amazon Kinesis
This is the first of two AWS Big Data blog posts on Writing SQL on Streaming Data with Amazon Kinesis Analytics. In this post, I provide an overview of streaming data and key concepts like the basics of streaming SQL, and complete a walkthrough using a simple example. In the next post, I will cover more advanced stream processing concepts using Amazon Kinesis Analytics.
Most organizations use batch data processing to perform their analytics in daily or hourly intervals to inform ... Read More
Best practices for streaming applications

Article image: Frank Gehry's Dancing House windows (source: Mounirzok on Wikimedia Commons). How Baidu combined Tachyon with Spark SQL to increase speed 30-fold. The Lambda Architecture has its merits, but alternatives are worth exploring. What it looks like to analyze, visualize, and even forecast human society using global news coverage. In this O'Reilly training video, the "Hadoop Application Architectures" authors present an end-to-end case study of a clickstream analytics engine to provide a concrete example of how to architect and implement a complete solution with Hadoop. In this segment, they provide an overview of the complete architecture. Presenters: ... Read More
Real-time in no time with Power BI

Earlier this year, we wrote about leveraging the Power BI REST APIs to create real-time dashboards in Power BI. Ever since, we’ve seen thousands of Power BI users engage with our real-time capabilities. Today, I am happy to announce the preview availability of functionality which makes it even easier to stream real-time data to Power BI, and to see that data light up in your dashboards. Designed for easy setup, these new features show our commitment towards allowing our users to quickly gain insights from data of all shapes, sizes and velocities. Real-time data empowers users to make quick decisions on time-sensitive ... Read More
Technology Integration: Not Just for Your Smartphones

With the increasing amount of information that we use daily, technology is only becoming more and more important in everything we do. And businesses are seeing this at much greater scale than we do as consumers. There are many great examples of this in just about every industry. Retailers, for example, now rely heavily on technology to improve sales. Stores now have access to consumer behavior data, helping them further tailor advertisements and products to customers. Many retailers use algorithms to analyze consumer data (i.e., recommendation engines) to determine what other items a customer might be interested in after they purchase ... Read More
List Of NoSQL Databases [currently >225]
Your Ultimate Guide to the Non-Relational Universe! [including a historic Archive 2009-2011] News Feed covering some changes here! NoSQL DEFINITION: Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable. The original intention has been modern web-scale databases. The movement began in early 2009 and is growing [...] ... Read More