Posts tagged statistics
MySQL 5.7: Improved JOIN Order by Taking Condition Filter Effect into Account

Feed: Planet MySQL. Author: Øystein Grøvlen. One of the major challenges of query optimizers is to correctly estimate how many rows qualify from each table of a join. If the estimates are wrong, the optimizer may choose a non-optimal join order. Before MySQL 5.7, the estimated number of rows from a table only took into account the conditions from the WHERE clause that were used to set up the access method (e.g., the size of an index range scan). This often led to row estimates that were far too high, resulting in very wrong cost estimates for join plans. To ... Read More
Monitoring Databases: A Product Comparison

Feed: Planet MySQL. Author: MySQL Performance Blog. Manjot Singh | March 16, 2017 | Posted In: Database Monitoring, MariaDB, MySQL, Percona Monitoring and Management, Percona Monitoring Plugins. In this blog post, I will discuss the solutions for monitoring databases (which includes alerting) I have worked with and recommended in the past to my clients. This survey will mostly focus on MySQL solutions. One of the most common issues I come across when working with clients is monitoring and alerting. Many times, companies will fall into one of these categories: No monitoring or alerting. This means they have no idea what’s ... Read More
Ensuring Cloud Success – Q&A with Cloudera’s Charles Zedlewski

Feed: Database Trends and Applications : All Articles. The rise of big data and the growing popularity of cloud is a combination that presents valuable new opportunities to leverage data with greater efficiency. But organizations also need to be aware of some key differences between on-premise and cloud deployments, says Charles Zedlewski, senior vice president, products, at Cloudera, which provides a data management and analytics platform built on Hadoop and open source technologies. Key things organizations should focus on to ensure success in the cloud are a greater degree of convenience to users and a smaller cost footprint, he advises. How ... Read More
A formal spec for GitHub Flavored Markdown

Feed: Planet MySQL. Author: GitHub Engineering. We are glad we chose Markdown as the markup language for user content at GitHub. It provides a powerful yet straightforward way for users (both technical and non-technical) to write plain text documents that can be rendered richly as HTML. Its main limitation, however, is the lack of standardization on the most ambiguous details of the language. Things like how many spaces are needed to indent a line, how many empty lines you need to break between different elements, and a plethora of other trivial corner cases change between implementations: very similar looking Markdown ... Read More
The Value of Exploratory Data Analysis – Silicon Valley Data Science

Feed: Planet big data. Author: Meg Blanchette. And why you should care | March 9th, 2017 Editor’s note: Chloe (as well as other members of SVDS) will be speaking at TDWI Accelerate in Boston. Find more information, and sign up to receive our slides here. From the outside, data science is often thought to consist wholly of advanced statistical and machine learning techniques. However, there is another key component to any data science endeavor that is often undervalued or forgotten: exploratory data analysis (EDA). At a high level, EDA is the practice of using visual and quantitative methods to understand ... Read More
23 Great Blogs Posted in the last 12 Months
Feed: Featured Blog Posts - Data Science Central. Author: Vincent Granville. This is part of a new series of articles: once or twice a month, we post previous articles that were very popular when first published. These articles are at least 6 months old but no more than 12 months old. The previous digest in this series was posted here a while back. 23 Great Blogs Posted in the last 12 Months The New Rules for Becoming a Data Scientist Learning R in Seven Simple Steps 12 Python Resources for Data Science 12 Statistical and Machine Learning Methods that Every Data Scientist ... Read More
Planet scale aggregates with Azure DocumentDB

Feed: Microsoft Azure Blog. Author: Aravind Ramachandran. We’re excited to announce that we have expanded the SQL grammar in DocumentDB to support aggregate functions with the last service update. Support for aggregates is the most requested feature on the user voice site, so we are thrilled to roll this out to everyone that's voted for it. Azure DocumentDB is a fully managed NoSQL database service built for fast and predictable performance, high availability, elastic scaling, global distribution, and ease of development. DocumentDB provides rich and familiar SQL query capabilities with consistent low latencies on JSON data. These unique benefits make DocumentDB ... Read More
How to think like a data scientist to become one
Feed: Featured Blog Posts - Data Science Central. Author: Karolis Urbonas. We have all read the punchlines – data scientist is the sexiest job, there’s not enough of them and the salaries are very high. The role has been sold so well that the number of data science courses and college programs is growing like crazy. After my previous blog post I have received questions from people asking how to become a data scientist – which courses are the best, what steps to take, what is the fastest way to land a data science job? I tried to really think it ... Read More
12 Great Curated Blogs About Data Science
Feed: Featured Blog Posts - Data Science Central. Author: Vincent Granville. The following articles were recently hand-picked and curated by one of our interns, Emmanuelle. They quickly became popular, and cover dozens of topics of interest to data scientists. This is just a small, random selection among dozens of highly popular curated blogs on DSC. We plan to publish a similar listing of different curated blogs, bi-monthly. Authors (data scientists) are welcome to contact us for inclusion in our series, or for a non-commercial featured guest blog post, or to just blog directly on DSC. Precision vs significance / accuracy ... Read More
In case you missed it: February 2017 roundup

Feed: Planet big data. Author: David Smith. In case you missed them, here are some articles from February of particular interest to R users. Public policy researchers use R to predict neighbourhoods in US cities subject to gentrification. The ggraph package provides a grammar-of-graphics framework for visualizing directed and undirected graphs. Facebook has open-sourced the "prophet" package they use for forecasting time series at scale. A preview of features coming soon to R Tools for Visual Studio 1.0. On the differences between using Excel and R for data analysis. A data scientist suggests a "Gloom Index" for identifying the most ... Read More