Posts tagged metadata
Information about other data or objects. In Amazon Simple Storage Service (Amazon S3) and Amazon EMR, metadata takes the form of name–value pairs that describe the object. These include default metadata, such as the date last modified, and standard HTTP metadata, such as Content-Type. Users can also specify custom metadata when they store an object. In Amazon Elastic Compute Cloud (Amazon EC2), metadata includes data about an EC2 instance that the instance can retrieve to determine things about itself, such as its instance type and IP address.
IPv6 endpoints are now available for the Amazon EC2 Instance Metadata Service, Amazon Time Sync Service, and Amazon VPC DNS Server
Feed: Recent Announcements. The Amazon EC2 Instance Metadata Service, Amazon Time Sync Service, and Amazon VPC DNS server can now be accessed over IPv6 endpoints by instances built on the Nitro System. These local instance services have IPv6 addresses that can be accessed from your Amazon EC2 instances. These IPv6 endpoints use Unique Local Addresses (ULA); IPv6 for local instance services is useful for running software and containers in an IPv6-only single stack configuration. Additionally, if you are starting your transition to IPv6 in a dual-stack environment, the endpoints for the Instance Metadata Service, Amazon Time Sync Service, and Amazon ... Read More
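As a rough illustration of what reaching the Instance Metadata Service over either address family might look like, here is a minimal Python sketch. It assumes the well-known link-local IPv4 address 169.254.169.254, the IPv6 ULA fd00:ec2::254, and the IMDSv2 session-token flow; none of these details appear in the announcement excerpt above, and the helper names (`imds_url`, `token_request`, `metadata_request`, `fetch_metadata`) are made up for this example. The IPv6 endpoint may also need to be enabled on the instance before it responds.

```python
import urllib.request

# Assumed endpoint addresses (not stated in the excerpt above).
IMDS_IPV4 = "169.254.169.254"
IMDS_IPV6 = "fd00:ec2::254"


def imds_url(path, ipv6=False):
    """Build a metadata service URL; IPv6 literals need brackets in URLs."""
    host = f"[{IMDS_IPV6}]" if ipv6 else IMDS_IPV4
    return f"http://{host}/latest/{path.lstrip('/')}"


def token_request(ipv6=False, ttl_seconds=21600):
    """Prepare the IMDSv2 PUT request that asks for a session token."""
    return urllib.request.Request(
        imds_url("api/token", ipv6),
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path, token, ipv6=False):
    """Prepare a GET request for one metadata item, e.g. 'instance-type'."""
    return urllib.request.Request(
        imds_url(f"meta-data/{path}", ipv6),
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path, ipv6=False, timeout=2):
    """Fetch one metadata value. Only works when run on an EC2 instance."""
    with urllib.request.urlopen(token_request(ipv6), timeout=timeout) as resp:
        token = resp.read().decode()
    request = metadata_request(path, token, ipv6)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode()
```

On an instance, `fetch_metadata("instance-type", ipv6=True)` would attempt the lookup over IPv6. Off an instance the call simply times out, which is why the request builders are kept separate from the network call.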
Best Practices for Metadata Management

Feed: Alation. Author: Jason Rushin. July 19, 2021 — What Is Metadata? Metadata is information about data. A clothing catalog or dictionary are both examples of metadata repositories. Indeed, a popular online catalog, like Amazon, offers rich metadata around products to guide shoppers: ratings, reviews, and product details are all examples of metadata. An asset alone is just the tip of the iceberg; metadata tells you “what lies beneath.” Folks who work closely with data, like analysts, data scientists, and IT teams, rely on metadata to give them crucial context for how to use a given asset. Today, metadata is ... Read More
AWS Elemental MediaPackage extends its metadata passthrough capabilities
Feed: Recent Announcements. AWS Elemental MediaPackage now supports timed ID3 metadata passthrough for live and VOD streams in HLS, CMAF, and DASH formats. ID3 metadata tags enable data to be embedded into video streams at specified timecodes and used by downstream systems or clients to enhance the playback experience. By dynamically adding metadata to a stream, you can enable use cases such as ad-insertion beaconing, client-side dynamic graphical overlays, content chapters, and audio track listing. In addition, MediaPackage now also supports Key Length Value (KLV) metadata for DASH live streams, as per the latest MISB ST1910.1 specification. Similar to timed ... Read More
Use the Metadata API to Track SQL Executed Against Snowflake (Shared Job)

Feed: Matillion. Author: Julie Polito; The recent release of Matillion ETL version 1.54 introduced a new API, the Metadata API, to our Snowflake and Redshift enterprise offerings. What can you do with this new API? For a whole number of reasons (including data governance, audit logging, or just wanting to see what keeps changing your data in a table) you might be looking to answer questions such as: What SQL statements has Matillion executed against Snowflake today? Which Matillion job(s) have appended data to my table? What data is being pulled from my Salesforce system into Snowflake? Using the steps below ... Read More
Quick Hit: Processing macOS Application Metadata Weirdly Fast with mdls and R
Feed: R-bloggers. Author: hrbrmstr. [This article was first published on R – rud.is, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here) Want to share your content on R-bloggers? click here if you have a blog, or here if you don't. (reminder: Quick Hits have minimal explanatory blathering, but I can elaborate on anything if folks submit a comment). I’m playing around with Screen Time on xOS again and noticed mdls (macOS command line utility for getting file metadata) has a -plist option (it probably has for a while & I just never ... Read More
Replay the Execution of MySQL With RR (Record and Replay)

Feed: Planet MySQL; Author: Marcelo Altmann; Chasing bugs can be a tedious task, and multi-threaded software doesn’t make it any easier. Threads will be scheduled at different times, instructions will not have deterministic results, and in order for one to reproduce a particular issue, it might require the exact same threads, doing the exact same work, at the exact same time. As you can imagine, this is not straightforward. Let’s say your database is crashing or even having a transient stall. By the time you get to it, the crash has happened and you are stuck restoring service quickly and ... Read More
Build Slowly Changing Dimensions Type 2 (SCD2) with Apache Spark and Apache Hudi on Amazon EMR

Feed: AWS Big Data Blog. Organizations across the globe are striving to improve the scalability and cost efficiency of the data warehouse. Offloading data and data processing from a data warehouse to a data lake empowers companies to introduce new use cases like ad hoc data analysis and AI and machine learning (ML), reusing the same data stored on Amazon Simple Storage Service (Amazon S3). This approach avoids data silos and allows you to process the data at very large scale while keeping the data access cost-effective. Starting off with this new approach can bring with it several challenges: Choosing ... Read More
5 perspectives on modern data analytics
Feed: CIO. Some things don't change, even during a pandemic. Consistent with previous years, in CIO’s 2021 State of the CIO survey, a plurality of the 1,062 IT leaders surveyed chose “data/business analytics” as the No. 1 tech initiative expected to drive IT investment. Unfortunately, analytics initiatives seldom do nearly as well when it comes to stakeholder satisfaction. Last year, CIO contributor Mary K. Pratt offered an excellent analysis of why data analytics initiatives still fail, including poor-quality or siloed data, vague rather than targeted business objectives, and clunky one-size-fits-all feature sets. But a number of fresh approaches and technologies are making ... Read More
Using Github Actions & drat to Deploy R Packages
Feed: R-bloggers. Author: R on Chemometrics & Spectroscopy using R. Last summer, a GSOC project was approved for work on the hyperSpec package, which had grown quite large and hard to maintain. The essence of the project was to break the original hyperSpec package into smaller packages. As part of that project, we needed to be able to: (1) provide development versions of packages; (2) provide large data-only packages (potentially too large to be hosted on CRAN). In this post I’ll describe how we used Dirk Eddelbuettel’s drat package and Github Actions to automate the deployment of packages between repositories. What is ... Read More
Using Kubernetes and the Future Package to Easily Parallelize R in the Cloud
Feed: R-bloggers. Author: JottR on R. [This article was first published on JottR on R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here) Want to share your content on R-bloggers? click here if you have a blog, or here if you don't. This is a guest post by Chris Paciorek, Department of Statistics, University of California at Berkeley. In this post, I’ll demonstrate that you can easily use the future package in R on a cluster of machines running in the cloud, specifically on a Kubernetes cluster. This allows you to easily ... Read More