As it turns out, Silicon Valley tech giants aren't the only institutions surreptitiously collecting massive troves of sensitive data from unsuspecting consumers. On Friday, Canada's Global News published a report exposing a recently launched data collection program at StatCan, the Canadian government's economic research agency, which the agency introduced to gather more accurate data about consumers' spending habits. The agency has asked Canada's nine largest banks to turn over all the transaction records and sensitive identifying financial information (including customers' social insurance numbers) for 500,000 randomly selected Canadians. The agency will collect and crunch this data as part of its statistical research and then, at the end of the year, it will produce a new list of 500,000 Canadians and perform all of the same operations with their data.
After being called out by Global News, the agency explained that the data would be anonymized shortly after being compiled (meaning that all identifying information, like consumers' SINs, would be removed).
"Canadians should know we are not accessing all of the payments data for all Canadians. It’s a small sample relative to the total number of households," he said. "Our access to this data is permitted through both the Privacy Act and the Statistics Act."
But that's not exactly true. The fact that the agency didn't publicly disclose the plan has left some Canadians feeling uneasy. And given that a new sample of 500,000 Canadians will be drawn every year, the likelihood that any one individual's information will eventually be collected is far from negligible. To be sure, the agency said in a letter to Canada's privacy commissioner that the data would only be used for statistical purposes. But a former privacy regulator who spoke with Global News said she was "shocked" to learn of the program.
Ontario’s former privacy commissioner, Ann Cavoukian, said she was shocked by the initiative and said the ability for a government agency to build a massive database of personal banking information raises serious privacy concerns.
"Most people would be surprised and devastated if they thought all of their financial information and bills and activity were being accessed in identifiable form by Statistics Canada or any branch of government," she said. "Medical and financial records are the most sensitive personal data that exists."
As Global News's chief political correspondent explained in an editorial criticizing StatCan's surreptitious collection program, the agency has long struggled to collect accurate data about Canadians' spending habits, relying on a staff of interviewers who phone everyday Canadians and ask them what they spend. Say the agency wanted to determine how much money the average Canadian male between the ages of 24 and 50 spent on iTunes every month. Well, its staff of nearly 1,000 interviewers would call thousands of Canadians with this demographic profile and ask them.
But there's one glaring problem here: Who remembers exactly how much money they spent on iPhone apps and music downloads in any given month? And few people have the time, or the willingness, to check their credit card records and share specific dollar amounts. And even if some did, how would the agency verify whether they were being truthful?
But if StatCan wanted to know what the average 50-year-old male with a cat living in suburban Ottawa spends in music downloads from Apple’s iTunes Store every month, it would have to convince me and other men with those characteristics to participate in a survey - a survey that might be done by phone, by mail or online.
Indeed, Statistics Canada employs the equivalent of nearly 1,000 people as “interviewers,” who spend all day asking everyday Canadians and businesses about their activities so everything we do can be counted up.
For much of the important data about our economy and household spending - upon which many important decisions, such as interest rates and taxation levels, are based - Statistics Canada relies heavily on surveys.
But surveys have an accuracy issue. Do you remember how much you spent on groceries last April? Last month? How much butter did you consume? How many times did you fill up at the gas pump? You might have roughly accurate answers to these questions, but they are probably not as precise as a data scientist would like.
On top of that, StatCan has the same problem that pollsters have: people these days just don’t want to answer the phone or go online for a lengthy survey.
So StatCan devised a plan to improve the accuracy and efficiency of its data collection.
As a result, researchers at StatCan came up with another idea: to feed a computer program to the agency’s massive database of 20 million or more "households." That database will spit out a list of 500,000 “household” members, who together create a representative sample of the entire country. The list would have the same ratios of men to women, French speakers to English speakers and Calgarians to Haligonians that actually exist in the country.
The representative sample would also be chosen randomly, and on that list would be the name of each person, their social insurance number, date of birth, home address and gender. You would have a one-in-20 chance of being on this list.
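For readers wondering how a randomly drawn list can still mirror the country's makeup, here is a minimal sketch of proportional (stratified) sampling. It is an illustration only: the report does not describe StatCan's actual software, and the function name `proportional_sample`, the toy population and its 75/25 language split are all hypothetical.

```python
import random
from collections import defaultdict

def proportional_sample(records, key, sample_size, seed=None):
    """Draw a random sample whose group shares mirror the population's."""
    rng = random.Random(seed)
    # Split the population into strata (e.g. by language or city).
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    total = len(records)
    sample = []
    for group in strata.values():
        # Proportional allocation: each stratum gets its population share.
        # Rounding means the total can differ slightly from sample_size.
        quota = round(sample_size * len(group) / total)
        sample.extend(rng.sample(group, min(quota, len(group))))
    return sample

# Hypothetical toy population: 1,000 households, 75% English, 25% French.
population = [
    {"id": i, "language": "fr" if i % 4 == 0 else "en"} for i in range(1000)
]
sample = proportional_sample(
    population, key=lambda r: r["language"], sample_size=100, seed=1
)
french_share = sum(r["language"] == "fr" for r in sample) / len(sample)
print(len(sample), french_share)
```

Within each group the picks are random, but the quota per group is fixed, which is how a list can be both "chosen randomly" and representative.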
But then, next year and the year after that, a new list would be drawn up. Eventually, the odds of making it onto the list would drastically improve for millions of Canadians.
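To put a rough number on how quickly those odds improve, here is a back-of-the-envelope sketch. It takes the editorial's one-in-20 figure at face value and assumes, purely for illustration, that each year's list is drawn independently and that nobody is excluded for having been picked before; the report does not say whether either assumption holds.

```python
def chance_selected_at_least_once(p_per_year, years):
    """Probability of landing on at least one annual list."""
    # Complement rule: 1 minus the chance of being missed every single year.
    return 1 - (1 - p_per_year) ** years

# Assumed per-year odds of 1 in 20, taken from the editorial.
for years in (1, 5, 10):
    print(years, round(chance_selected_at_least_once(1 / 20, years), 3))
```

Under those assumptions, the odds climb from 5% in the first year to roughly 23% after five years and about 40% after ten.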
Once the list has been generated, those 500,000 names would be given, under strict privacy controls, to each of Canada’s nine largest banks and credit card companies. Because your bank or credit card company also likely knows your name, SIN, date of birth and so on, each financial institution would be able to draw up its own list of customers that are also on the StatCan list.
Canada's largest banks are worried that StatCan's collection program could inspire Canadians to bank with smaller institutions that aren't subject to the collection. And although StatCan has promised to anonymize the data, the fact remains that no institution is immune to hackers, least of all government agencies. And now that hackers know StatCan possesses this invaluable trove of sensitive personal data, we wouldn't be surprised to learn that someone, somewhere, will try to steal it.
Read the documents detailing StatCan's collection efforts below: