Defining Privacy
A critical investigation of Canadian political discourse


5 Conclusion

In order to have a reasonable expectation of privacy, the meaning of privacy must first be understood. While the privacy of personal information is protected in Canada by a number of federal statutes, none of them include a definition or explanation of the meaning of privacy.

Through a two-stage text analysis of all the parliamentary debates between the 39th and the 41st Parliaments, it was determined that privacy was narrowly defined by the House of Commons during that period as a ‘right to secrecy for Canadians who abide by the law’.

Research Summary

This conclusion was drawn from the results of two complementary methods of text analysis as well as a review of the privacy legislation, literature, and jurisprudence in Canada.

The review of the legislation, literature, and jurisprudence was primarily confined to documents pertaining to the federal level of governance in Canada, the House of Commons. The review determined that privacy, in terms of Canadian federal legislation such as the Privacy Act and PIPEDA, is overwhelmingly concerned with the protection of personal information, as it relates to its collection, use, disclosure, and destruction. While privacy is not a right in Canada, the Privacy Act has been considered to be a quasi-constitutional piece of legislation, which is a symbol of the importance of privacy and privacy legislation in a free and democratic society such as Canada. PIPEDA has been characterized as fundamental, but not quasi-constitutional. This review also summarized some of the major philosophical concepts regarding the importance of privacy to society, coming to the conclusion that a reasonable expectation of privacy amounts to a reasonable expectation of anonymity. Anonymity can be understood as the freedom of an individual to negotiate societal expectations, both in public and in private, while remaining free from identification.

The first method of text analysis processed the text electronically as a means of generating frequency and concordance data. The text was compiled into a corpus from the publication known as Hansard, which consists of the complete transcripts of the parliamentary debates in the House of Commons. The corpus was organized as two distinct parts: yearly, from 2006 - 2015 inclusive; and by Parliament and Session, starting with the 1st Session of the 39th Parliament and ending with the 2nd Session of the 41st Parliament. Using the coding language Python, information about the frequency of the word ‘privacy’ was determined and concordances were generated, which showed the context of ‘privacy’ at the level of the sentence.
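As a rough illustration of what frequency and sentence-level concordance data look like in practice, the following is a minimal sketch using only the Python standard library. It is not the code published with this thesis: the function names, the naive sentence splitter, and the normalization per 10,000 tokens are all assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequency(text, word):
    """Return (raw count, count per 10,000 tokens) for `word`.

    The per-10,000 normalization is an assumption for this sketch.
    """
    tokens = tokenize(text)
    count = Counter(tokens)[word.lower()]
    per_10k = 10000 * count / len(tokens) if tokens else 0.0
    return count, per_10k

def sentence_concordance(text, word):
    """Return every sentence that contains `word` as a whole word."""
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    # Abbreviations such as 'Mr.' would be split incorrectly.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

# Invented sample text, not taken from Hansard.
sample = (
    "Privacy matters to Canadians. "
    "The bill concerns taxation. "
    "We must protect the privacy rights of citizens!"
)

print(word_frequency(sample, "privacy"))
for sentence in sentence_concordance(sample, "privacy"):
    print(sentence)
```

A corpus the size of Hansard would need a more careful sentence splitter and would be processed file by file, but the two outputs (a count and a list of sentences in which the word appears) are of the same kind as those described above.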

The results of this first text analysis informed the selection of a specific debate for the second analysis, which was chosen as a result of the frequency of the phrase ‘privacy rights’. This phrase had the highest relative frequency in 2014 compared to any of the other years in the corpus. The sitting of May 5, 2014 was chosen for analysis, again because it had the highest relative frequency of the phrase ‘privacy rights’ compared to any other sitting in that year. The transcript was analyzed using Fairclough’s method of Critical Discourse Analysis, which argues that language can only be understood within the context of its use. This requires an interpretation and examination of the social structures, institutions, practices, roles, and relations that collectively led to the construction of the text under analysis. The analysis determined that the debate of May 5, 2014 was a representation of the political practice of parliamentary debate, which consists of political actors engaging in persuasive practical argumentation with the purpose of making a decision about a topic or situation. By nature, parliamentary debates are a struggle over power and meanings, as the constraints that exist due to the nature of the discourse type, notably the reliance on a majority vote, concentrate power in the hands of those who have the most physical representation in the institution of Parliament. Despite this imbalance of representation, which is organized in the House of Commons by ideological party, the results of this particular analysis determined that the MPs in this debate shared a common ideology regarding the meaning of privacy. This meaning will be discussed in the Key Findings section of this chapter.
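The selection step described above, finding the year or sitting in which a phrase has the highest relative frequency, can be sketched as follows. This is an illustrative example only: the function names, the toy data, and the choice to normalize per 10,000 words are assumptions, and the exact normalization used in this research is not restated here.

```python
import re

def phrase_count(text, phrase):
    """Count case-insensitive occurrences of a multi-word phrase."""
    words = (re.escape(w) for w in phrase.split())
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def relative_frequency(text, phrase, per=10000):
    """Occurrences of `phrase` per `per` words (assumed normalization)."""
    total_words = len(re.findall(r"\w+", text))
    if total_words == 0:
        return 0.0
    return per * phrase_count(text, phrase) / total_words

def highest_relative_frequency(texts_by_label, phrase):
    """Return the label (e.g. a year or sitting date) with the highest rate."""
    return max(
        texts_by_label,
        key=lambda label: relative_frequency(texts_by_label[label], phrase),
    )

# Invented toy data standing in for yearly sub-corpora.
toy = {
    "2013": "privacy rights were raised once in a long debate about many other things",
    "2014": "privacy rights and privacy rights again",
}

print(highest_relative_frequency(toy, "privacy rights"))
```

Normalizing by the size of each sub-corpus matters because years and sittings differ in length; a raw count alone would favour the longest transcripts.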

This research is significant for two reasons. First, it showcases the interdisciplinary value of text analysis as a methodology with the successful integration of electronic and critical text analysis. Despite the assertion by Fairclough that the combination of electronic methods of text analysis and CDA contributes to the creation of “watered down criticism” (Language and Power 36), I strongly believe that a research project of this nature would not have been possible without the combination of both methods. Both methods of analysis are subject to the inherent biases of the researcher, and as long as that subjectivity is acknowledged and ‘called out’ by the researcher through the course of both analyses, there is no danger in the combination of methods.

Second, at the point of publication, a study of this breadth and depth has yet to be done on the transcripts of the Canadian Hansard. One of the key motivations of CDA is to expose the orders of discourse that contribute to the reinforcement or maintenance of the unequal distribution of power in society as a means of inspiring change. I can only hope that this study, or studies like it, will have a subtle impact in this regard.

Since I have posted my data and my code online, other researchers can conduct their own analyses, without having to download all of the files from the House of Commons website or learn a lot of code. I have shared the files in their original format, XML, as well as in a plain text format, which has been processed to remove the XML formatting. The code is easily accessible for the purpose of scrutiny or reuse.
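For readers unfamiliar with what removing the XML formatting involves, the following is a minimal sketch using Python's standard `xml.etree.ElementTree` module. It is an assumption-laden illustration rather than the published code: the element names in the sample (`Debate`, `Speaker`, `ParaText`, `B`) are hypothetical and are not claimed to match the actual Hansard schema.

```python
import xml.etree.ElementTree as ET

def xml_to_plain_text(xml_string):
    """Return the text content of an XML document with all tags removed."""
    root = ET.fromstring(xml_string)
    # itertext() yields every text fragment in document order,
    # including text nested inside child elements.
    pieces = (chunk.strip() for chunk in root.itertext())
    return " ".join(piece for piece in pieces if piece)

# Hypothetical markup; real Hansard XML uses its own element names.
sample_xml = (
    "<Debate>"
    "<Speaker>Mr. Example</Speaker>"
    "<ParaText>Privacy is <B>important</B> to Canadians.</ParaText>"
    "</Debate>"
)

print(xml_to_plain_text(sample_xml))
```

A real conversion would likely also decide which elements to keep (for example, speeches but not procedural headers), which is a research choice rather than a technical one.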

Key Findings

To build on Finestone’s metaphor, privacy legislation in Canada really is a patchwork garden full of weeds (26), and the MPs responsible for tending the garden cannot even distinguish between the helpful plants and the invasive ones.

A ‘Charter right to privacy’

This characterization stems from the analysis of the May 5, 2014 debate, where Conservative MPs continuously referred to a ‘Charter right to privacy’, even though no such right exists. None of the opposition MPs in the House that day attempted to refute this claim. While s. 8 of the Charter has been used successfully as a privacy defence, it only applies in the case of the unreasonable disclosure of personal information where either the consent of the individual or a warrant from the court has not previously been obtained. Referring to privacy in this way is categorically false, and while this was proven conclusively in only one debate, the repeated occurrences of the phrase ‘privacy rights’ throughout the entirety of the Hansard corpus suggest that May 5th was likely not an isolated event.

The definition of ‘personal information’

This fundamental misunderstanding of the ‘right to privacy’ was compounded by the MPs’ inability to agree on the definition of personal information during the course of this particular debate. Despite the fact that personal information is generally defined in both the Privacy Act and PIPEDA to include “information about an identifiable individual” (PIPEDA, s. 2(1); Privacy Act, s. 3), the MPs spent a great deal of time quibbling about whether or not an IP address constituted ‘basic subscriber information’ in the way that a name or a phone number does, with some MPs using the analogy of a ‘phone book’ to make their point. This is problematic for two related reasons. The first concerns the definition of personal information in the Privacy Act, and the second relates to the distinction between ‘information about identity’ and ‘information that identifies’.

The extended definition of personal information in the Privacy Act includes a qualifying phrase specifying that even the name of an individual is protected as personal information if it appears with other personal information relating to the individual (s. 3(i) ‘personal information’). What this means, in terms of the confusion around the meaning of basic subscriber information in the debate, is that the very fact that an IP address accompanied the name and phone number of an individual makes all of the information ‘personal.’ Because the IP address is related to an individual in a way that identifies them, the name, phone number, and IP address all equally deserve protection from disclosure. While the debate in question was generally on the topic of PIPEDA and Bill C-13, rather than the Privacy Act, any one of the MPs could have drawn on this definition as a point of clarification or to advance their argument. There were many occasions in this debate where MPs made reference to other pieces of legislation and Bills to advance their claims. The fact that none of them were aware of this important definition in a highly relevant piece of legislation speaks to a general lack of knowledge in the House about the meaning and intent of both of the Acts.

Related to this point is the distinction between ‘information about identity’ and ‘information that identifies’ (Slane and Austin 501). The reason the Privacy Act specifically calls out the relationship between an individual’s name and other information that relates to that individual is that linked information identifies. While it is true that names and phone numbers are freely available in publications like the phone book, IP addresses are not, precisely because they have the potential to reveal a vast amount of other personal information about an individual in a way that a name or phone number cannot. This concept was brought up in the debate in terms of a quote by former Privacy Commissioner Cavoukian, but never fully articulated or debated (Hansard Vol. 147 No. 80, 4902).

It’s important to note that the ruling in R. v. Spencer was about this very issue, and the judgment was made in favour of protecting personal information, which includes IP addresses, from warrantless disclosure.

Much like the misinterpretation of the Charter, it is highly unlikely that the MPs’ confusion surrounding the definition of personal information was isolated to this single debate.

The ‘hidden discourse’ of privacy

This confusion speaks to a deeper undercurrent of ideological belief in the House of Commons that answers the question posed at the beginning of this thesis, which sought to determine the meaning of privacy as it is understood and used by MPs.

The key distinction between ‘information about identity’ and ‘information that identifies’ is anonymity. Anonymity is not the same as secrecy; it is not an argument for keeping every single piece of information about the self private through the act of concealment. Rather, anonymity involves the ability of individuals to experiment with thoughts and opinions without the risk of ridicule, shame, or punishment from others, regardless of whether they are in a public or private space (Westin 33-34). Neither is anonymity a cover for committing illegal acts. The Criminal Code, the Privacy Act, and PIPEDA all contain provisions that allow for the lawful disclosure of personal information not only when a crime has occurred, but also when one is suspected.

The continued reference to the ‘privacy rights’ of ‘law-abiding Canadians’ by all Members of the House during this debate exposes a major ideological discourse which implies that only Canadians who abide by the law deserve privacy. Yet the protections afforded by both pieces of federal privacy legislation, as well as the Charter of Rights and Freedoms, apply equally to all Canadians, whether or not they engage in criminal activities. The implication that criminals do not deserve privacy relies on the assumption that criminals only want privacy because they have something to hide. This means that MPs, at least in the context of their role of politician in the House of Commons, subscribe ideologically to the narrow definition of privacy as secrecy.

The marginalization of criminals and suspected criminals – who are already in a drastically reduced position of power in society due to legal proceedings, imprisonment, punishment, and shame – is not even the worst part about this discourse. The problem with defining privacy as secrecy, rather than as anonymity, is that it suppresses the power and autonomy of all citizens in Canadian society by implying that an individual’s need for privacy is not necessary when they have nothing to hide (Solove 746-747). Westin argues that privacy is essential to the development of a person’s complete and unique sense of self, but that this depends almost entirely on the ability to anonymously experiment with thoughts and ideas before testing them out in society or discarding them (33-34).

A lack of anonymity, whether real or implied by discourse, ultimately has a ‘chilling effect’ on society as a whole, where people are more likely to conform to societal expectations rather than freely express their thoughts, viewpoints, and political beliefs (Solove 765; Task Force 18). The definition of privacy as secrecy is contrary to the quasi-constitutional status afforded to the Privacy Act by the Supreme Court, which stated that the Act should serve as a “reminder of the extent to which the protection of privacy is necessary to the preservation of a free and democratic society” (Lavigne v. Canada 789).

This is yet another serious example of the MPs’ fundamental misinterpretation and misunderstanding of the legislative power they wield, which is deeply troubling in that the stated purpose of a parliamentary democracy holds that the law is the supreme authority.

Limitations


The limitations of this research fit into two distinct categories: technical and theoretical. The technical limitations concern my ability to write code. The sheer size of the corpus, almost 69 million words, meant that web-based tools or programs were just not able to process the text. This resulted in my learning the coding language Python as I conducted the research. I made a lot of mistakes along the way, which resulted in several false starts and required many reinterpretations of the results. Due to the size of the corpus, and my novice understanding of code, some of the data that was generated needed a substantial amount of cleaning before it could be considered for analysis. This applies primarily to the generation of collocational statistics, which ended up being the least necessary and most time-consuming method of text analysis in the course of my research, and which produced the only data that I ultimately did not use. Since I was already interested in investigating a specific word, the collocational statistics merely confirmed the trends already discovered by the frequency statistics and concordances. In this regard I agree with Wermter and Hahn, who argue that simplicity is a good approach when it comes to text analysis (785); the most compelling trends in this research were the basic frequencies, and they were also the easiest to generate.
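To give a sense of what collocational data involves compared with a basic frequency count, the following is a minimal sketch that tallies the words appearing within a fixed window around a node word. It is illustrative only: it produces raw co-occurrence counts rather than a statistical association measure, and the function name and the window size of two words are assumptions made for the example.

```python
import re
from collections import Counter

def collocates(text, node, window=2):
    """Count words occurring within `window` tokens of each `node` occurrence."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts

# Invented sample text, not taken from Hansard.
sample = "the privacy rights of canadians and the privacy act protect privacy rights"

for word, count in collocates(sample, "privacy").most_common():
    print(word, count)
```

Even in this toy form, the output mixes function words such as ‘the’ with meaningful collocates such as ‘rights’, which hints at why data of this kind tends to need cleaning before it can be interpreted.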

The theoretical limitation of this research concerns the inherent subjectivity of text analysis as a research method. Text does not simply come into being, nor does it provide an objective representation of reality; it is understood and influenced by the individual cognition of the researcher, the context within which it was originally created, and the relationship between the researcher, their cognition, and their interpretation of the text to its context. Both computerized text analysis and Critical Discourse Analysis are subject to the biases and preconceptions of the researcher; this begins with the selection of a text or text collection, continues through the choice and application of methods, and ends by strongly influencing the interpretation and discussion of the results. The theoretical intention of the researcher is present at every stage (Stubbs, Text and Corpus Analysis 154) and more than one theoretical intention can be applied to the same research problem (Kuhn 76).

Specifically, the use of CDA as a methodology implies that the researcher is strongly influenced by personal ideology. The aim of CDA is to critically analyze the ways in which language contributes to the struggle for power in society, which means that the researcher believes that power imbalances are something that requires criticism. If I were not concerned about my own need for informational privacy, it is unlikely that I would have chosen to research it. Similarly, if I were not concerned about the implications of Bill C-13 in terms of warrantless access to personal information, I would have chosen another body of text to analyze. The fact that the first stage of text analysis pointed to trends that I deemed worthy of further investigation confirms that I suspected the trends would exist.

Despite these limitations, the nature of text analysis, especially in terms of electronic methods, allows for the reproduction, examination, and critique of the work by others. This is the reason I have made my data and code publicly available. Publishing data and code serves the dual purpose of opening up my analysis to criticism and review, and promoting the application of text analysis as a methodology for future research.

Potential for future research

The most obvious potential for future research in terms of this specific collection of text would be a deeper investigation of some of the other trends identified by this research. While 2014 was an interesting year in terms of the frequency of ‘privacy’, so was 2011, and it would make sense to continue the CDA at that point in the text. Hansard continues to be published online, and it would be interesting to see how a House of Commons under a new government compares to the previous one in terms of the discourse of privacy. This will require more time to pass, as the 1st Session of the 42nd Parliament has yet to be prorogued.

The public availability of the data and the code lends itself to the investigation of other topics, and another researcher could pick this up quite easily. The code has been published in IPython Notebook, which is a format that allows people with all levels of coding literacy to use the code, involving only a minimum degree of setup on their own computers. This means that the code can continue to be used in the analysis of the Hansard corpus, but also for other bodies of text. In this regard, it would be interesting to analyze the jurisprudence discussed in this research in order to do a comparative analysis of the difference between judicial and parliamentary discourse. This is an area where collocational statistics may have more relevance, although I will continue to advocate for simplicity rather than complexity in the selection of methods for text analysis.

Interest in the acquisition of coding literacy among librarians and scholars in the humanities and social science disciplines is increasing. The structure and methodology of this thesis contribute to the growing body of interdisciplinary research that studies text and discourse from the dual perspectives of close and distant reading.

Final Thoughts

Distant reading, as it is described by Moretti, involves stepping so far back from a collection of text that it effectively disappears and cannot be traditionally read (57). From that distance, a unit of analysis can be defined and followed throughout the entirety of the collection (Moretti 61-62). In the case of this research, the unit of analysis was a single word, ‘privacy’, which was then followed from the beginning to the end of the Hansard corpus. Metaphorically, the word rose from the text like a beacon, occurring exactly 6,478 times in a sea of almost 69 million other words. By employing the methods of text analysis, this distant reading instantly revealed patterns of frequency and use that would have otherwise required months to uncover by hand. These distinct areas of interest enabled a focused reduction of the distance, which allowed phrasal patterns (concordances) to emerge.

To paraphrase Adolphs, the discovery of frequencies and phrasal patterns provided a way into the text that was informed by the text itself (19). The close reading necessary for CDA was performed on a small selection of text that was most representative of the trends uncovered by the distance. Mautner argues that the method of moving from a large body of text to a small selection helps to reduce some of the bias from CDA (123), much like the limitations discussed above. I believe that the distance does not reduce bias in terms of influencing the initial selection of text, which is her argument, but in the ability of the researcher to make a stronger critical claim as a result of the distance.

While nothing can be conclusively ‘proven’ by either method of text analysis, my critical claim that privacy is narrowly defined in the House of Commons as a ‘right to secrecy for Canadians who abide by the law’ is supported by the results of the distant reading, which uncovered clusters of phrases and patterns of word use that spanned the entire corpus. Furthermore, since I have openly provided both the data and the code, those who refute my claim can come to their own conclusions through the replication or advancement of my work.
