B. , R. Christen, P. Et, C. , and T. , A comparison of fast blocking methods for record linkage, ACM SIGKDD '03 Workshop on Data Cleaning, Record Linkage and Object Consolidation, 2003.

B. and T. , A proposed improvement in computer matching, Statistics of Income and Related Administrative Record Research, pp.167-172, 1990.

B. , T. Et, R. , and D. , A method for calibrating false-match rates in record linkage, Journal of the American Statistical Association, vol.90, issue.430, pp.694-707, 1995.

B. , M. Cohen, W. Feinberg, S. Mooney, R. Et et al., Adaptive name-matching in information integration, IEEE Intelligent Systems, vol.18, pp.16-23, 2003.

B. , M. Jolley, D. Sundararajan, V. Evans, S. Pilcher et al., Data linkage : A powerful research tool with potential problems, BMC Health Services Research, vol.10, p.346, 2010.

A. Borg and M. Sariyar, RecordLinkage : Record Linkage in R, 2016.

G. Box, C. Et, and D. , An analysis of transformations, Journal of the Royal Statistical Society. Series B (Methodological), vol.26, issue.2, pp.211-252, 1964.

C. , R. ;. , S. Ganti, V. Et, M. et al., Regression analysis of probability-linked data, Official Statistics Research Series, 4. CHANDURI, pp.865-876, 2005.

C. , P. Et, S. , and J. , Bayesian classification (AutoClass) : theory and results, Advances in Knowledge Discovery and Data Mining, 1997.

C. and P. , A comparison of personal name matching : Techniques and practical issues, Sixth IEEE International Conference on Data Mining-Workshops (ICDMW'06), pp.290-294, 2006.

C. and P. , A two-step classification approach to unsupervised record linkage, Proceedings of the Sixth Australasian Conference on Data Mining and Analytics, vol.70, pp.111-119, 2007.

C. and P. , Automatic training example selection for scalable unsupervised record linkage, Proceedings of the 12th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD'08, pp.511-518, 2008.

C. and P. , Febrl-: An open source data cleaning, deduplication and record linkage system with a graphical user interface, Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.1065-1068, 2008.

C. and P. , Data Matching : Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection, 2012.

C. , M. Kurien, V. Lalk, G. Et, S. et al., Efficient data reconciliation, Information Sciences, vol.137, issue.1, pp.1-15, 2001.

C. and W. , Data integration using similarity joins and a word-based information representation language, ACM Transactions on Information Systems, vol.18, issue.3, 2000.

C. and W. W. , Integration of heterogeneous databases without common domains using queries based on textual similarity, pp.201-212, 1998.

C. , J. Hilton, and F. , Record linkage : Statistical models for matching computer records, Journal of the Royal Statistical Society. Series A (Statistics in Society), vol.153, issue.3, pp.287-320, 1990.

C. , A. Et, M. , and L. , Apprentissage Artificiel : Concept et Algorithmes. Algorithme. Eyrolles, 2010.

D. , A. Laird, N. Et, R. , and D. , Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, vol.39, issue.1, pp.1-38, 1977.

D. , J. Et, T. , and V. , Validating distance-based record linkage with probabilistic record linkage, Topics in Artificial Intelligence, pp.207-215, 2002.

D. , S. Et, C. , and P. , Data Mining for Bioinformatics, 1997.

E. , M. Verykios, V. Et, E. , and A. , TAILOR : a record linkage toolbox, Proceedings 18th International Conference on Data Engineering, pp.17-28, 2002.

E. , A. Ipeirotis, P. Verykios, and V. , Duplicate record detection : A survey, IEEE Transactions on Knowledge and Data Engineering, vol.19, issue.1, 2007.

F. , I. Et, S. , and A. , A theory for record linkage, Journal of the American Statistical Association, vol.64, issue.328, pp.1183-1210, 1969.

F. , J. Roberts, C. Et, T. , and L. , Characteristics of unmatched maternal and baby records in linked birth records and hospital discharge data, Paediatric and Perinatal Epidemiology, vol.20, issue.4, pp.329-337, 2006.

F. , M. Liseo, B. Nuccitelli, A. Scanu, and M. , On Bayesian record linkage, Research in Official Statistics, vol.4, issue.1, 2001.

F. and J. , Algorithme "EM" : Théorie et application au modèle mixte, Journal de la Société Française de Statistique, vol.18, issue.3-4, pp.57-109, 2002.

F. , I. Schwarzinger, M. Binquet, C. Benzenine, E. Hill et al., Contribution of record linkage to vital status determination in cancer patients, Studies in Health Technology and Informatics, vol.150, pp.91-95, 2009.

G. and L. , Ox-link : The Oxford medical record linkage system, 1999.

G. , H. Carpenter, J. Kenward, M. Et, L. et al., Multilevel models with multivariate mixed response types, Statistical Modelling, vol.9, issue.3, pp.173-197, 2009.

G. , H. Harron, K. Et, C. , and M. , A scaling approach to record linkage, Statistics in Medicine, vol.36, issue.16, pp.2514-2521, 2017.

G. , H. Harron, K. Et, W. , and A. , The analysis of record-linked data using multiple imputation with data value priors, Statistics in Medicine, vol.31, issue.28, pp.3481-3493, 2012.

G. , S. Koudas, N. Marathe, A. Srivastava, and D. , Merging the results of approximate match operations, Proceedings of the Thirtieth International Conference on Very Large Data Bases, vol.30, pp.636-647, 2004.

H. and S. , Analysis of qualitative data, vol.2, 1979.

H. , K. Doidge, J. Knight, H. Gilbert, R. Goldstein et al., A guide to evaluating linkage quality for the analysis of linked data, International Journal of Epidemiology, vol.46, issue.5, pp.1699-1710, 2017.

H. , K. Wade, A. Gilbert, R. Muller-pebody, B. Goldstein et al., Evaluating bias due to data linkage error in electronic healthcare records, BMC Medical Research Methodology, vol.14, p.36, 2014.

H. , T. Tibshirani, R. Friedman, and J. , The Elements of Statistical Learning, 2001.

H. , M. Stolfo, and S. , Real-world data is dirty : Data cleansing and the merge/purge problem, Data Mining and Knowledge Discovery, vol.2, issue.1, pp.9-37, 1998.

H. , T. N. Scheuren, F. J. Winkler, and W. E. , Data Quality and Record Linkage Techniques, 2007.

H. , M. Et, Z. , and A. , Methods for analyzing data from probabilistic linkage strategies based on partially identifying variables, Statistics in Medicine, vol.31, issue.30, pp.4231-4242, 2012.

H. , M. Et, Z. , and A. , A mixture model for the analysis of data derived from record linkage, Statistics in Medicine, vol.34, issue.1, pp.74-92, 2015.

I. and A. , Modern Multivariate Statistical Techniques : Regression, Classification and Manifold Learning. Springer Texts in Statistics, 2008.

J. , S. Et, N. , and R. , A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model, Journal of Computational and Graphical Statistics, vol.13, issue.1, pp.158-182, 2004.

J. and M. , Probabilistic linkage of large public health data files, Statistics in Medicine, vol.14, issue.5, pp.491-498, 1995.

J. and M. A. , Advances in record-linkage methodology as applied to matching the 1985 census of Tampa, Florida, Journal of the American Statistical Association, vol.84, issue.406, 1989.

J. , M. A. States, U. Washington, and P. Jurczyk, UNIMATCH : a record linkage system : users manual, FRIL : Fine-grained Record Integration and Linkage Tool Tutorial, 1978.

K. and P. , Robustness of the Census Bureau's record linkage system, Proceedings of the Section on Survey Research Methods, pp.620-624, 1986.

K. , G. Chambers, and R. , Regression analysis under incomplete linkage, Computational Statistics & Data Analysis, vol.56, issue.9, pp.2756-2770, 2012.

K. , G. Chambers, and R. , Regression analysis under probabilistic multi-linkage, Statistica Neerlandica, vol.66, issue.1, pp.64-79, 2012.

L. , P. Larsen, and M. , Regression analysis with linked data, Journal of the American Statistical Association, vol.100, issue.469, pp.222-230, 2005.

L. , A. Jougla, E. Et, R. , and G. , Base AMPHI : Base de données pour l'Analyse de la Mortalité Post-Hospitalisation en France en, vol.159, pp.2102-6238, 2008.

L. and M. , Advances in record linkage theory : Hierarchical Bayesian record linkage theory, 2005.

L. , M. Et, R. , and D. , Iterative automated record linkage using mixture models, Journal of the American Statistical Association, vol.96, issue.453, pp.32-41, 2001.

L. , T. L. Fox, M. P. Fink, and A. K. , Applying Quantitative Bias Analysis to Epidemiologic Data, 2009.

L. , S. Richard, J. , R. , G. Beck et al., Testing the Acceptability of Asking Respondents for Identifying Information in a Cross-Sectional Survey of the General Population, Population, English edition, vol.72, issue.4, pp.697-713, 2017.

L. , E. Srivastava, J. Prabhakar, S. Richardson, and J. , Entity identification in database integration, Informatics and computer science, issue.1, p.89, 1996.

L. and A. , A Bayesian record linkage methodology for multiple imputation of missing links, Dossiers Solidarité Santé, vol.64, 2004.

M. , G. Krishnan, and T. , The EM Algorithm and Extensions, 2008.

M. , X. Et, R. , and D. , Maximum likelihood estimation via the ECM algorithm : A general framework, Biometrika, vol.80, issue.2, pp.267-278, 1993.

M. , X. Van-dyk, and D. , The EM algorithm-an old folk-song sung to a fast new tune, Journal of the Royal Statistical Society : Series B (Statistical Methodology), vol.59, issue.3, pp.511-567, 1997.

M. , A. Elkan, and C. , The field matching problem : Algorithms and applications, Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp.267-270, 1996.

M. , A. Elkan, and C. , An efficient domain-independent algorithm for detecting approximately duplicate database records, 1997.

N. , J. Maynes, E. Ramanathan, and R. , The effect of mismatching on the measurement of response error, Journal of the American Statistical Association, vol.60, issue.312, pp.1005-1027, 1965.

N. , H. Et, K. , and J. , Automatic linkage of vital records, Science, vol.130, issue.3381, pp.954-959, 1959.

N. , J. Pearson, and E. , On the problem of the most efficient tests of statistical hypotheses, Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, vol.231, pp.289-337, 1933.

N. , K. Mccallum, A. Thrun, S. Et, M. et al., Text classification from labeled and unlabeled documents using EM, Machine Learning, vol.39, pp.103-134, 2000.

P. and L. , Hanging on the metaphone, Computer Language, vol.7, issue.12, pp.39-44, 1990.

Q. , C. Gouyon, B. Avillach, P. Ferdynus, C. Sagot et al., Using discharge abstracts to evaluate a regional perinatal network : Assessment of the linkage procedure of anonymous data, International Journal of Telemedicine and Applications, 2009.

R. , E. Sorlie, P. Et, J. , and N. , Probabilistic methods in matching census samples to the national death index, Journal of Chronic Diseases, vol.39, issue.9, pp.719-734, 1986.

R. , D. Belin, and T. , Recent developments in calibrating error rates for computer matching, Conference Paper 1991 Annual Research Conference : Proceedings : Bureau of the Census, 1991.

S. , M. Et, F. , and S. E. , A generalized Fellegi-Sunter framework for multiple record linkage with application to homicide record systems, Journal of the American Statistical Association, vol.108, pp.385-397, 2013.

S. , M. Borg, A. Et, P. , and K. , Controlling false match rates in record linkage using extreme value theory, Journal of Biomedical Informatics, vol.44, issue.4, pp.648-654, 2011.

S. and J. , Analysis of Incomplete Multivariate Data, 1997.

S. , F. Winkler, and W. , Regression analysis of data files that are computer matched, Survey Methodology, 1993.

S. and J. , A method for consideration of conditional dependencies in the Fellegi and Sunter model of record linkage, Statistical Papers, vol.46, issue.3, pp.433-449, 2005.

S. , J. Et, C. , and N. , Kernel Methods for Pattern Analysis, 2004.

S. and R. , Entity resolution with empirically motivated priors, Bayesian Analysis, vol.10, issue.4, pp.849-875, 2015.

S. , R. Hall, R. Et, F. , and S. , A Bayesian approach to graphical record linkage and de-duplication, 2013.

S. , R. Hall, R. Et, F. , and S. , SMERED : A Bayesian approach to graphical record linkage and de-duplication, 2014.

S. , R. Ventura, S. Sadinle, M. Et, F. et al., A comparison of blocking methods for record linkage, Privacy in Statistical Databases, pp.253-268, 2014.

T. and R. , New York State Identification and Intelligence System, Technical Report Special, issue.1, 1970.

T. , A. Liseo, and B. , A hierarchical Bayesian approach to record linkage and population size problems, The Annals of Applied Statistics, vol.5, issue.2, pp.1553-1585, 2011.

T. , A. Liseo, and B. , Some advances on Bayesian record linkage and inference for linked data, Proceedings of the ESSnet Data Integration Workshop, 2011.

T. and Y. , The discrimination power of dependency structures in record linkage, Survey Methodology, vol.19, issue.1, 1993.

T. , V. Et, D. , and J. , Record linkage methods for multidatabase data mining, in : Information Fusion in Data Mining, number 123 in Studies in Fuzziness and Soft Computing, pp.101-132, 2003.

T. , M. Méray, N. Ravelli, A. Reitsma, J. Et et al., Ignoring dependency between linking variables and its impact on the outcome of probabilistic record linkage studies, Journal of the American Medical Informatics Association : JAMIA, vol.15, issue.5, pp.654-660, 2008.

V. , V. Elmagarmid, A. Et, H. , and E. , Automating the approximate record-matching process, Information Sciences, vol.126, pp.83-98, 2000.

W. and W. , Using the EM algorithm for weight computation in the Fellegi-Sunter model of record linkage, Bureau of the Census Statistical Research Report Series, 1988.

W. and W. , String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage, 1990.

W. and W. , Improved decision rules in the Fellegi-Sunter model of record linkage, 1993.

W. and W. , The state of record linkage and current research problems. Technical report, Statistical Research Division, 1999.

W. and W. , Machine learning, information retrieval and record linkage. Technical report, 2000.

W. and W. , Methods for record linkage and Bayesian networks, 2002.

W. and W. , Automatically estimating record linkage false match rates. Technical report, Statistical Research Division, 2007.

W. , G. ;. Wu, C. Et, J. , and F. , On the convergence properties of the EM algorithm, The Annals of Statistics, vol.11, pp.95-103, 1983.