Examining the arc of 100,000 stories

I recently came across a great natural language dataset from Mark Riedl: 112,000 plots of stories downloaded from English language Wikipedia. This includes books, movies, TV episodes, and video games: anything that has a Plot section on a Wikipedia page.

This offers a great opportunity to analyze story structure quantitatively. In this post I’ll do a simple analysis, examining what words tend to occur at particular points within a story, including words that characterize the beginning, middle, or end.

Variance Explained → Examining the arc of 100,000 stories: a tidy analysis
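
To make the idea in the story-arc item concrete, here is a minimal Python sketch of one way to find where words tend to fall within a story. It is not the method used in the linked post; the function name `median_word_positions`, the whitespace tokenization, the `min_count` threshold, and the two sample plots are all assumptions made up for illustration. Each word occurrence is assigned a relative position between 0 (first word) and 1 (last word), and the median position per word is reported.

```python
from collections import defaultdict
from statistics import median


def median_word_positions(plots, min_count=2):
    """Return {word: median relative position} for words seen at least min_count times.

    Position 0.0 is the first word of a plot and 1.0 is the last.
    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    positions = defaultdict(list)
    for plot in plots:
        words = plot.lower().split()
        if len(words) < 2:
            continue  # a one-word plot has no meaningful relative position
        last_index = len(words) - 1
        for i, word in enumerate(words):
            positions[word].append(i / last_index)
    return {
        word: median(seen)
        for word, seen in positions.items()
        if len(seen) >= min_count
    }


# Hypothetical sample plots, invented for this sketch (not from the dataset).
plots = [
    "once a farmer finds a map and finally returns home",
    "once two rivals meet at sea and finally reconcile",
]

result = median_word_positions(plots)
for word, position in sorted(result.items(), key=lambda item: item[1]):
    print(f"{word}: {position:.2f}")
```

Words with a median near 0 characterize beginnings and words near 1 characterize endings; on a real corpus a much higher `min_count` would likely be needed to filter out rare words.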

Space, Time and Groceries

At Instacart, we deliver a lot of groceries. By the end of next year, 80% of American households will be able to use Instacart. Our challenge: complete every delivery on-time, with the right groceries as fast as possible.
Over the course of a week, we traverse cities all over the United States many times over while delivering groceries:
How do we bring order to the chaos?
[...] we’ll first introduce the logistics problem Instacart is solving, outline the architecture of our systems and describe the GPS data we collect. Then we will conclude by touring a series of datashader visualizations:
Visualizations like these help us to build intuition about our system, generate hypotheses for improvements, sanity check our changes, identify best practices and improve our operations.

Instacart Tech → Space, Time and Groceries

Packaging Design by Algorithm

Millions of Italians can now say they own a one-of-a-kind Nutella jar. In February, 7 million jars appeared on shelves in Italy, all of them boasting a unique label design. And here's a weird twist: Every single one of those millions of labels was designed by...an algorithm?
[...] this algorithm's output was millions upon millions of labels for real-life Nutella jars.

Inc. → Nutella 'Hired' an Algorithm to Design New Jars. And It Was a Sell-Out Success.

How Airbnb Democratizes Data Science

Data is essential to us at Airbnb. We characterize data as the voice of our users at scale. Thus, data science plays the role of an interpreter — we use data and statistics to understand our users and translate it to a voice that people or machines can understand. We leverage these quantitative insights, paired together with qualitative insights (e.g. in-person user research) to make the best possible decisions for both the business and our community of hosts and guests.

Airbnb Engineering & Data Science → How Airbnb Democratizes Data Science With Data University

Build a Better Monster: Morality, Machine Learning, and Mass Surveillance

Today the technology that ran that arcade game permeates every aspect of our lives. We’re here at an emerging technology conference to celebrate it, and find out what exciting things will come next. But like the tail follows the dog, ethical concerns about how technology affects who we are as human beings, and how we live together in society, follow us into this golden future. No matter how fast we run, we can’t shake them.

This year especially there’s an uncomfortable feeling in the tech industry that we did something wrong, that in following our credo of “move fast and break things”, some of what we knocked down were the load-bearing walls of our democracy.

Maciej Cegłowski → Build a Better Monster: Morality, Machine Learning, and Mass Surveillance

The Myth of a Superhuman AI

I've heard that in the future computerized AIs will become so much smarter than us that they will take all our jobs and resources, and humans will go extinct. Is this true?
That’s the most common question I get whenever I give a talk about AI. The questioners are earnest; their worry stems in part from some experts who are asking themselves the same thing. These folks are some of the smartest people alive today, such as Stephen Hawking, Elon Musk, Max Tegmark, Sam Harris, and Bill Gates, and they believe this scenario very likely could be true. Recently at a conference convened to discuss these AI issues, a panel of nine of the most informed gurus on AI all agreed this superhuman intelligence was inevitable and not far away.

Backchannel → The Myth of a Superhuman AI

A list of A.I. tools you can use today — for personal use

Artificial Intelligence and the fourth industrial revolution have made considerable progress over the last couple of years. Most of the usable progress so far has been developed for industry and business purposes, as you’ll see in coming posts. Research institutes and dedicated, specialised companies are working toward the ultimate goal of AI (cracking artificial general intelligence), developing open platforms and looking into the ethics that follow suit. There are also a good handful of companies working on AI products for consumers.

Hackernoon → A list of artificial intelligence tools you can use today — for personal use

How the United Nations built (and measured) its data marketplace

Used by people in over 200 countries and territories around the world, HDX has become the platform the UN, NGOs, governments and humanitarian actors can depend on when coordinating data-driven relief efforts. In fact, the United Nations (along with the New York Times and The Economist) relied on HDX as the common platform for data during the Ebola Crisis.

The Signal → How the United Nations built (and measured) its data marketplace

How Big Data Helps Today’s Airlines Operate

If you can relate to the stress of rushing to the airport, long security lines leading to crowded terminals, boarding passes and IDs, checking and stowing bags, and cramped compartments filled with travel-weary passengers, you probably consider yourself an experienced airline passenger. However, knowing the ins and outs of flying as a passenger doesn’t give the average person insight into the complicated operations side of airlines today. The intense competition within the airline industry leads to innovation as companies seek to save and make money and increase efficiency, with a recent focus on the advantages big data provides.

Examples of airlines creatively using big data to improve performance abound. United Airlines shifted focus in 2014 and began using the mantra of “collect, detect and analyze” data and saw a 15 percent year-over-year revenue increase in their online sales after offering customers a tailored, big data driven shopping experience. Delta invested in baggage tracking data and then created a baggage tracking app for customers that has been downloaded over 11 million times. Southwest started using a big data platform tracking their Boeing planes’ fuel usage trends, which is saving the airline millions of dollars annually. Japan Airlines recently launched a data collection system that measures temperature on airplane components with IBM Japan. The idea is to collect enough data to predict technical problems and prevent costly flight cancellations.

KDnuggets → How Big Data Helps Today’s Airlines Operate

Forget Coding: Writing Is Design’s “Unicorn Skill”

These days many designers can code–an increasingly important skill for landing a job. But few are just as fluent in their own language as they are in JavaScript. That presents a serious problem in terms of design. Users still depend on copy to interact with apps and other products. If designers don’t know how to write well, the final product–be it a physical or digital one–can suffer as a result.

In his “2017 Design in Tech Report,” John Maeda writes that “code is not the only unicorn skill.” According to Maeda, who is the head of computational design and inclusion at Automattic and former VP of design at VC firm Kleiner Perkins, words can be just as powerful as the graphics in which designers normally traffic. “A lot of times designers don’t know that words are important,” he said while presenting the report at SXSW this weekend. “I know a few designers like that–do you know these designers out there? You do know them, right?”

Co.Design → Forget Coding: Writing Is Design’s “Unicorn Skill”

Should we be afraid of AI?

Suppose you enter a dark room in an unknown building. You might panic about monsters that could be lurking in the dark. Or you could just turn on the light, to avoid bumping into furniture. The dark room is the future of artificial intelligence (AI). Unfortunately, many people believe that, as we step into the room, we might run into some evil, ultra-intelligent machines. 

Aeon → Should we be afraid of AI?

Are we going from "Artificial Intelligence" to "Augmented Intelligence?"

AI is going to augment natural human intelligence and enable people to gain the world’s collective expertise while requiring less time and study than what has been required to become an expert in any one thing today. Traditionally in humans, an expert’s mind possesses fewer possibilities for slower growth, while a beginner’s mind offers many possibilities for rapid growth.

Scott Noteboom → Are we going from "Artificial Intelligence" to "Augmented Intelligence?"

Regulating the internet giants

The world’s most valuable resource is no longer oil, but data.
A new commodity spawns a lucrative, fast-growing industry, prompting antitrust regulators to step in to restrain those who control its flow. A century ago, the resource in question was oil. Now similar concerns are being raised by the giants that deal in data, the oil of the digital era. These titans—Alphabet (Google’s parent company), Amazon, Apple, Facebook and Microsoft—look unstoppable. They are the five most valuable listed firms in the world. Their profits are surging: they collectively racked up over $25bn in net profit in the first quarter of 2017. Amazon captures half of all dollars spent online in America. Google and Facebook accounted for almost all the revenue growth in digital advertising in America last year.

Such dominance has prompted calls for the tech giants to be broken up, as Standard Oil was in the early 20th century. This newspaper has argued against such drastic action in the past. Size alone is not a crime. The giants’ success has benefited consumers. Few want to live without Google’s search engine, Amazon’s one-day delivery or Facebook’s newsfeed. Nor do these firms raise the alarm when standard antitrust tests are applied. Far from gouging consumers, many of their services are free (users pay, in effect, by handing over yet more data). Take account of offline rivals, and their market shares look less worrying. And the emergence of upstarts like Snapchat suggests that new entrants can still make waves.

But there is cause for concern. Internet companies’ control of data gives them enormous power. Old ways of thinking about competition, devised in the era of oil, look outdated in what has come to be called the “data economy”. A new approach is needed.

The Economist → The world’s most valuable resource is no longer oil, but data

Top mistakes data scientists make

The rise of the data scientist continues and social media is filled with success stories – but what about those who fail? There are no cover articles praising the fails of the many data scientists that don’t live up to the hype and don’t meet the needs of their stakeholders.

The job of the data scientist is solving problems. And some data scientists can’t solve them. They either don’t know how to, or are obsessed with the technology part of the craft and forget what the job is all about. Some get frustrated that “those business people” are asking them to do “simple trivial data tasks” while they’re working on something “really important and complex”. There are many ways a data scientist can fail – here’s a summary of the top three mistakes that are a straight path towards failure.

Cyborgus → Top mistakes data scientists make

The High-Speed Trading Behind Your Amazon Purchase

Amazon gave people and companies the ability to sell on Amazon.com in 2000, and it has since grown into a juggernaut, representing 49% of the goods Amazon ships. Amazon doesn't break out numbers for the portion of its business driven by independent sellers, but that translates to tens of billions in revenue a year. Out of more than 2 million registered sellers, 100,000 each sold more than $100,000 in goods in the past year, Peter Faricy, Amazon's vice president in charge of the division that includes outside sellers, said at a conference last week.

It's clear, after talking to sellers and the software companies that empower them, that the biggest of these vendors are growing into sophisticated retailers in their own right. The top few hundred use pricing algorithms to battle with one another for the coveted "Buy Box," which designates the default seller of an item. It's the Amazon equivalent of a No. 1 ranking on Google search, and a tremendous driver of sales.

Dow Jones → The High-Speed Trading Behind Your Amazon Purchase

Netflix’s Grand, Daring, Maybe Crazy Plan to Conquer the World

Netflix is a notoriously data-driven company, and the Daredevil header art test is one of hundreds it will conduct this year. That data trove has also enabled Netflix’s gamble on global expansion, by illuminating one simple fact: People are all different, but not in the ways you’d imagine.

“There’s a mountain of data that we have at our disposal,” says Todd Yellin, Netflix’s VP of product innovation. Netflix has a well-earned reputation for using the information it gleans about its customers to drive everything from the look of the service to the shows in which it invests. “That mountain is composed of two things. Garbage is 99 percent of that mountain. Gold is one percent… . Geography, age, and gender? We put that in the garbage heap. Where you live is not that important.”

Wired → Netflix’s Grand, Daring, Maybe Crazy Plan to Conquer the World

The Big Data Heist

Every day, people give away their data to just a few shareholders using corporate giants as wealth managers. These data funds, such as Google or Facebook, then spin off AI-powered applications, that in turn become privately owned assets. As productive resources, data, AI and their byproducts are about to replace most jobs of the working class, even those of relatively senior executives. Many predict that AI-correlated jobs will only compensate for a tiny bit of those redundancies. This issue of “jobless growth” is a core characteristic of our transition into industry 4.0, and has pushed even Bill Gates to suggest that a tax on AI should be an option.

PersonalData.IO → The Big Data Heist

The Arrival of Artificial Intelligence

What is kind of amusing — and telling — is that as John McCarthy, who invented the name “Artificial Intelligence”, noted, the definition of specialized AI is changing all of the time. Specifically, once a task formerly thought to characterize artificial intelligence becomes routine — like the aforementioned chess-playing, or Go, or a myriad of other taken-for-granted computer abilities — we no longer call it artificial intelligence.

Stratechery → The Arrival of Artificial Intelligence