
Data is as Data were. Emerging Language in Everyday Speech.

Data, Media, Rice, Water. Emerging language and winds of change.

Language changes. It grows. It adapts. Nouns are turned into verbs (e.g. “friend”), words take on many meanings (e.g. “peer”) and subject/verb agreement transforms. Scholars know that the phrase “correct English” is a misnomer at best, a downright falsehood at worst. Languages are living things that grow and change.

We are on the cusp of one of those changes now. It truly could go either way. For a language geek, it’s an exciting event to watch. How will the now-ubiquitous words “data” and “media” be treated? Will the educational system catch up and drill the original usage of “data” and “media” as being plural nouns that require a plural 3rd person verb agreement? Or will colloquial usage overwhelm the textbooks and the subject will be simple, single and quick?

Let’s go over some details.

Datum is a single piece of data. Data are more than one datum.
Medium is a single type of media. Media are all the mediums lumped together.

The subject/verb agreement with these words traditionally went like this:

The datum is written on a piece of paper.
The data are enclosed in the report.

The medium was radio.
The media were newspapers.

(Or, in the case of journalists as a group of people: “The media report a storm coming up the coast.”)

Usage of “data” has morphed into the singular subject/verb agreement for many colloquial speakers (that means “regular people,” not specialized speakers like academics, scientists, etc.). “Data” and “media” are being treated as mass nouns, like rice (e.g. “The rice is in the cooker”) or water (e.g. “This water is cold!”). Now we are seeing usage like “The data doesn’t support your claim” and “The media isn’t welcome in the courtroom.”

We are seeing the singular subject/verb agreement usage more with the word “data” than with the word “media.” I don’t think most people would have “medium” on the tip of their tongue if they were asked to name the singular of media, but journalists have been drilling us with their self-referential phrase forever. So we know what “media” is supposed to sound like in a sentence, for the most part. (If “data” usage changes, then I think “media” won’t be far behind. But we’ll leave “media” be for now.)

“Data” is another problem entirely. I’ve been intimately aware of the usage rules around the word “data” for my entire adult life. When I was 18, I started at the University of Pittsburgh as a Psychology major, and I was quickly treated to a grammar lesson I didn’t soon forget. After years of psychology and biophysics research, then on to business research, I knew the expected plural subject/plural verb conjugation for the word “data.”

But here we are at the crossroads, where seemingly everyone besides the hardcore researchers uses “data” as a mass noun. Sure, the Twitterati will do their best to knock you back into their supposed knowledge and comfort zone as soon as they see a wayward “data is” or “data was.” But they aren’t looking at the big picture. Let’s think for a moment about data. This is a perfect example of why language changes. A cultural change happens, then language reflects that change. (I am now going to start using “data” as a mass noun. That means I will be using it in the singular, so those of you who are grammar-faint-of-heart, I suggest you stop reading now. But I do wish you would just hold your breath for a second and hear me out.)

Data is everywhere. It is coming at us from all sides. We have many convenient ways to get data. We have to make an effort to avoid data. We are data junkies. All of us. But in the end, we see data as a separate entity from ourselves. It is something we consume, like water. We choose to step up to it like we walk to the ocean’s very edge. We make the choice to dip our toes into it, or run away. We have our favorite ways of getting data, just like we have our favorite shoreline beaches. But we see it as a huge mass, almost one big entity of which we take small parts. We make distinctions on its bits. The grains of rice are in the container, but my rice is already cooked. No drops of water are on the window but water is leaking in everywhere. Bits of data are scattered around the internet but my data is on my blog. Wikipedia defines a mass noun as such:

“In linguistics, a mass noun (also uncountable noun or non-count noun) is a common noun that presents entities as an unbounded mass.”

An unbounded mass. Think about that. Think about all the info on the internet. Doesn’t it feel like “an unbounded mass” to you?

(ok grammarians, you can let out that breath. wasn’t too bad, was it?)

See what I mean? Which way will this go? Will data be accepted as a mass noun in the general culture? Or will everyday speakers be exposed to the word in its plural form so much that the phrase “the data are everywhere” sounds right to them?

Let me know what you think in the comments. Your data is/are important to me.

Christine Cavalier, PurpleCar


UPDATE: Here is the paragraph on usage of the word “data” from Merriam-Webster:

“usage Data leads a life of its own quite independent of datum, of which it was originally the plural. It occurs in two constructions: as a plural noun (like earnings), taking a plural verb and plural modifiers (as these, many, a few) but not cardinal numbers, and serving as a referent for plural pronouns (as they, them); and as an abstract mass noun (like information), taking a singular verb and singular modifiers (as this, much, little), and being referred to by a singular pronoun (it). Both constructions are standard. The plural construction is more common in print, evidently because the house style of several publishers mandates it.”

Comments on this entry are closed.

  • PurpleCar 29 January 2010, 11:14 am

    Hey all. Found a related article via Twitter, from Stan Carey: http://stancarey.wordpress.com/2009/05/07/data-is-data-or-are-they/ Will post updates as I find them.

  • Another Language Geek 29 January 2010, 1:09 pm

    For a language geek, you aren’t very careful. “As a language geek, it’s an exciting event” needs “PurpleCar” as the referent for “language geek”, not “event”. It isn’t the event that is the language geek.
    Of course language changes. That doesn’t mean that all change is good, Barack Obama notwithstanding. We can retard changes that are not helpful and changes that actually slow readers down because they can be misinterpreted. Writers have a responsibility to choose only helpful changes.

    • PurpleCar 29 January 2010, 2:53 pm

      Ack! I wrote this quite late last night, and I meant to write “For a language geek.” Changing now. Thanks!



    • PurpleCar 29 January 2010, 2:56 pm

      OK, that’s done.
      Thanks for commenting, ALG.

      This change with the verb usage is not “helpful” or “unhelpful.” It simply is an organic change reflecting the change in culture. You can hem and haw all you want, but this is not a change that I’ll slow down or fight against in any way. Good luck to you if you’d like to fight the good fight.



  • Andrew 29 January 2010, 1:33 pm

    As you allude to, with your memorable experience at Pittsburgh, the treatment of the numeracy of “data” and “media” exhibits a social distribution across genres of usage, or ‘registers,’ as linguists sometimes call them. The historically ‘correct’ usage of both still has robust (even punitive) support in academic and other specialist registers, so, as long as that persists, any other usage in the “general culture” is susceptible to criticism. And invoking your metaphorical motivation in defense of the more novel usage, while creative, is awkward, at best. I don’t go in much for prescriptivism of any sort; I’m just pointing out that an authoritative value is attached to the ‘correct’ use, and speakers ignore that asymmetry at their peril. You never know who in your audience has a degree in Psychology. 😉

    • PurpleCar 29 January 2010, 3:04 pm

      Punitive. LOL. Great word. And ain’t that the truth. 😉

      Metaphorical motivation wasn’t my intent. My article is for a non-academic audience, specifically the Twitter grammar police who are not professional linguists in the least. Their knowledge of language is cursory at best. This is just an introduction to the issue for a general audience. I can understand how linguists would be annoyed at my use of metaphor to explain the concept of a mass noun.

      I agree that it’s best for all users to avoid subject/verb agreement phrasing with the words “data” and “media.” Stick ’em at the end of the sentence, I tell writers, or use “bits of data” or “pieces of media.” At this juncture, taking my political stance requires a bit of bravery in the face of risk. I tend toward the populist nature, so naturally I’d take this route myself, but I don’t presume to require compliance from that “general culture” to which I refer. (<— I didn't end that phrase in a preposition for your sake.:-) )

      Christine Cavalier, PurpleCar


      • Andrew 29 January 2010, 9:00 pm

        Hah, not to nitpick, but as I said, I’m not a prescriptivist when it comes to grammar. The vast majority of linguists do not subscribe to a conventional notion of ‘correctness’ when it comes to grammar. They’re more interested in a descriptive approach that uses native judgments of correctness. So, in fact it’s typically a non-expert audience that is most receptive to rigid notions of correctness and most likely to correct fellow speakers for perceived errors. My point about the metaphor isn’t that it’s annoying at all. It’s creative! And whenever you examine these kinds of linguistic phenomena closely, you find that native speakers can often produce elaborate motivations for a particular usage. (When they’re widely shared, linguistic anthropologists refer to them as ‘folk ideologies.’) My point was a more prosaic one: In the moment, when someone tells you you’ve just used ‘data’ incorrectly, it’s a bit unwieldy to uncoil the notion that we’re all adrift in a sea of data. Anyway, your advice to writers is good. It’s always better to avoid these little trouble spots entirely. Fun stuff, and that’s what I come here *for*. 😉

        • PurpleCar 30 January 2010, 12:39 am

          LOL glad to see my odd humor is only esoteric most of the time.

          I love folk ideologies as an area of study. Fascinating. So many great pop books have been published lately in the behavioral economics realm about motivations and how people dream up elaborate justifications for them, when in reality one’s decisions are mostly influenced by the environment and events directly preceding the decisions. (Stumbling on Happiness by Daniel Gilbert is highly entertaining along these lines).

          My interests tend to be at the crossroads of cultures. My master’s thesis concentrated on the use of African-American dialect (Philadelphia, PA) in classroom settings in public schools. I wanted to see how different teachers looked on the colloquial use in subjects other than English or English Lit. Did they take a prescriptivist or descriptivist approach? How did the students speak? I presented the data in ethnography form as well as more quantitative methods. I’m more of a psycholinguistics person in general, rather than a grammar geek. I had some exposure to C.A. Perfetti at Pitt as an undergrad, and he blew me away. Also, Pitt’s Philosophy department taught me everything I needed to know about theory… I digress.

          So now, years after my research has long finished, I find myself isolated from academia and immersed in a popular yet niche culture. My descriptivist and populist ethics send me into a tailspin when confronted by the inane attitudes of some in this crowd, and out come blog posts or Twitter conversations like these. I try to spread some tolerance and understanding, but with language, I find, the folk ideologies never get enough credence from the folks themselves! Perhaps it is our stark Calvinist or Protestant background that drives us toward a schoolmarmesque attitude toward grammar rules. Who knows?

          One thing I do know is that the grammar police on Twitter are très annoying and should be avoided at all costs.

          Sorry I jumped on your “metaphor” comment. I thought perhaps you were saying I was using the equivalent of anecdotal evidence (*shudder*). You may be right: the explanation may prove bulky. I’m hoping the presentation of one instance of a mass noun will be enough to convince this tech crowd to disregard their steadfast theories. Usually, if programmers are presented with just one anomaly, they’ll go to work immediately rewriting the code.

          I suppose I shouldn’t bother. This post has already attracted some intolerance and mimicry, despite the absence of any claims of my expertise in this area. I’m interested; I’m aware of theory. I’d like to spread some new thoughts of acceptance and tolerance to my current cohort. It should be simple. Yet I find that my tech community is less tolerant of language changes than any other group I’ve experienced. We could talk for hours about schema or gestalts or egos or stick-your-favorite-personality-theory-in-here and we’d still get nowhere on the road to understanding the pure fear and insecurity that gets roiled up when these people spot “incorrect” English. For the love of PETE!

          Anyway, I’m not looking to explain away why we should move to a mass noun usage for “data.” I’m trying to reach a community that needs to understand change in certain ways. I personally don’t care all that much about why it’s changing, just the fact that it is changing. I thought the sea metaphor might give my folks some basic understanding of the concept of mass noun and how “data” fits into it.

          I probably should go back to posting about Facebook.



  • pienkovski 30 January 2010, 9:15 am

    To see that language is a living organism is the first step toward understanding the relationships between humans.
    Congratulations, Christine Cavalier. Your article is excellent, your English is beautiful.
    I would like to write like you.

    • PurpleCar 30 January 2010, 12:54 pm



      Thank you! And you are right, Mr. Pienkovski, the first step is knowledge. And it’s nice to know you. Please come again soon.

      -Christine Cavalier, PurpleCar


  • aceblack1965 7 February 2010, 11:55 pm

    Nice post Christine. I think the battle for common sense has already been won. Those still complaining about the use of Data as a mass noun are simply out of touch. I’ve been using Data as a mass noun for about 20 years — I simply never saw any reason for the words “information” and “data” to be treated differently. Who would say “the information are important?” By the way, “staff” is another word heading in the same direction.

    • PurpleCar 8 February 2010, 1:31 pm

      Thanks, Ace!

      Wow, how many times have people tried to “correct” your English over all these years? What did you do when that happened? The Grammar Gestapo rules high and mighty on Twitter especially.

      “Staff” is an interesting example. I’ve seen both uses. Mostly, I’ve seen that if a speaker is referring to a nearby, currently existing group of employees, they use “are” (e.g. The staff are downstairs watching the news about the storm right now), and if the speaker is referring to every employee at once, they use “is” (e.g. The staff is pleasantly surprised by the recent expansion of benefits). Good catch on that one, Ace. It hadn’t blipped on my radar.

      -PurpleCar Christine Cavalier