General counting 

Number of years: 51
Corpus: acl, acmtslp, alta, anlp, cath, cl, coling, conll, csal, eacl, emnlp, hlt, icassps, ijcnlp, inlg, isca, jep, lre, lrec, ltc, modulad, mts, muc, naacl, paclic, ranlp, sem, speechc, tacl, tal, taln, taslp, tipster, trec
Total number of couples: 11500
Number of Self-reuse couples: 4538
Number of Self-plagiarism couples: 6751
Number of Reuse couples: 88
Number of Plagiarism couples: 123
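
As a quick consistency check, the four per-type counts should add up to the total number of couples. The sketch below copies the figures from the summary; the variable names are illustrative and not part of the original tool.

```python
# Counts copied from the summary above; names are hypothetical.
total_couples = 11500
by_type = {
    "self-reuse": 4538,
    "self-plagiarism": 6751,
    "reuse": 88,
    "plagiarism": 123,
}

# The four categories are expected to partition the set of couples.
assert sum(by_type.values()) == total_couples

# Share of each category in the total.
for kind, count in by_type.items():
    print(f"{kind}: {count} ({100 * count / total_couples:.1f}%)")
```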

With symmetric couples

Example: P13-2046/acmtslp2013-6 + acmtslp2013-6/P13-2046

 Matrix for: all types of copy&paste 

To locate the number of pairs for a given CorpusCopy / CorpusPaste:

- Find the CorpusCopy in the left vertical header,

- Then find the CorpusPaste in the top horizontal header.

The crossing cell is the number of pairs.
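
The lookup described above can be sketched as a count keyed by (CorpusCopy, CorpusPaste). This is a minimal illustration, not the tool's own code: the function names and the sample pairs are hypothetical, and only three corpora are used.

```python
from collections import Counter

def build_matrix(pairs):
    """Count pairs per (copy corpus, paste corpus) cell."""
    return Counter(pairs)

def lookup(matrix, corpus_copy, corpus_paste):
    """Row = CorpusCopy (left header), column = CorpusPaste (top header)."""
    return matrix[(corpus_copy, corpus_paste)]

def row_total(matrix, corpus_copy):
    """The 'tot' column: all pairs whose copy side is corpus_copy."""
    return sum(n for (copy, _), n in matrix.items() if copy == corpus_copy)

def column_total(matrix, corpus_paste):
    """The 'tot' row: all pairs whose paste side is corpus_paste."""
    return sum(n for (_, paste), n in matrix.items() if paste == corpus_paste)

# Hypothetical sample data: one tuple per detected pair.
sample_pairs = [
    ("acl", "acl"),
    ("acl", "acl"),
    ("acl", "acmtslp"),
    ("lrec", "acl"),
]
matrix = build_matrix(sample_pairs)

print(lookup(matrix, "acl", "acl"))   # crossing cell for acl / acl
print(row_total(matrix, "acl"))       # row total for acl
print(column_total(matrix, "acl"))    # column total for acl
```

A Counter returns 0 for a cell that was never seen, which matches the zero-filled cells of the matrices.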

acl acmtslp alta anlp cath cl coling conll csal eacl emnlp hlt icassps ijcnlp inlg isca jep lre lrec ltc modulad mts muc naacl paclic ranlp sem speechc tacl tal taln taslp tipster trec tot
acl2381491367927312284852834950020714019251862912002449880
acmtslp100000002002320601100002001010020024
alta302001501240010400400000010000000432
anlp700134711214000100400102100000002547
cath100122000101000000300000000000000213
cl900430502432010002900101304000000053
coling741038862192317154351925744014914093351352730001365646
conll26111120188461811214220291030705130100300179
csal3000044270232210380270000110260001400119
eacl1620253112631813312900211010121140000501161
emnlp9922124451271610163114462270528008024211902002017507
hlt841205349481242143423303141050426201426109800025721621
icassps1650003411314824261201023001900201620066000752032342
ijcnlp27610032910723418244705193080135930000501225
inlg800116520325011201600105000000010051
isca5723020144603201025117153911387101013420012039601234000679054178
jep000000000000000090000000000006000015
lre210002300001000202600001100000010022
lrec57302616795141517171710272053651206011115126200514525
ltc300000000000020150135100200651400000084
modulad00000000000000000000000000000000000
mts13000029202911390100220207086211000200121
muc30020240010700000000001110000000018150
naacl481002124307121123616224310316109042091000803303
paclic400010120111029030518803002171000010098
ranlp3200002442210700021940201242100000165
sem252000717134112120800013121010814520000001194
speechc00000110110041710500020000000010001800106
tacl11000000001000000020000000100000006
tal0000000000000000500000000000001300018
taln0000000000000000300000000000053900065
taslp05000011151142070011800200102000170004900424
tipster3003006000150000002000131000000004745
trec1104112060221232830601100000100110000224295443
tot6279314504743350215164713236149021832454024961714866173011031261935619234985922162967375 

 Matrix for: self-reuse 

acl acmtslp alta anlp cath cl coling conll csal eacl emnlp hlt icassps ijcnlp inlg isca jep lre lrec ltc modulad mts muc naacl paclic ranlp sem speechc tacl tal taln taslp tipster trec tot
acl15610467311013755206177180920108112631412001604374
acmtslp00000000000010020000000000000000003
alta100000200110000100100000010000000311
anlp200022201101000100100102000000000117
cath00002100000000000010000000000000004
cl300110101210000001200000002000000015
coling336114348105626911321209201031494930001004248
conll1311101311601127140101400004034010020091
csal2000032230107101500000000000400030043
eacl92001177210580125001110007014000020187
emnlp3722002925103395616080390010510901001403201
hlt379000221738218107143360281091400330001215235
icassps6300000031135157105460060010410018000354011138
ijcnlp720002183311821234031300201420000030195
inlg300102310121000101100001000000010019
isca1611010101007247204941045307543020610153000263021491
jep00000000000000000000000000000300003
lre00000110000000010030000010000001008
lrec281001634204594203001837504074181100203217
ltc3000000000000001011220000100200000022
modulad00000000000000000000000000000000000
mts2000012000301101011100200100000010027
muc00000100000100000000002000000000105
naacl3170201314221122513215028108010031000502152
paclic000000300000020101600100540000010024
ranlp0100001000110000011300101010000000021
sem5200033400520300064000010232000000173
speechc0000010030006002500100000000000070043
tacl01000000000000000020000000100000004
tal00000000000000001000000000000030004
taln0000000000000000000000000000015400019
taslp0100000040137100470000000100010002700156
tipster00000000000000000000001000000000001
trec300200001148000100400003000000008112147
tot2565559152281955515136189114768911912251662521504366434229087518772410143 

 Matrix for: self-plagiarism 

acl acmtslp alta anlp cath cl coling conll csal eacl emnlp hlt icassps ijcnlp inlg isca jep lre lrec ltc modulad mts muc naacl paclic ranlp sem speechc tacl tal taln taslp tipster trec tot
acl7204468461518152563211423001150201113812120000845479
acmtslp100000002002220401100002001010020021
alta202001301130010300300000000000000121
anlp400112510113000000300000100000002429
cath10010100010100000020000000000000029
cl600320301221000001300000102000000027
coling4142732811131291740711530056930622931160000361384
conll13000177245441102101510303028000010085
csal100001204012140020027000000022000110069
eacl70024145421353004001000105100000030074
emnlp620012142616136624830219021900601711100100612295
hlt45305326318341214122216167021800412006500013614370
icassps10200034199341710210447001300101010046000394021160
ijcnlp204100111741161612130263060120630000200128
inlg500014210204011100500104000000000032
isca4112010335024761896102793411037916010032500181000413032651
jep000000000000000090000000000003000012
lre210001200001000102300001000000000014
lrec29202510453131110812824103528702047344100310300
ltc00000000000002014002380200551200000062
modulad00000000000000000000000000000000000
mts11000017202610280801920508421100010091
muc2002013001060000000000810000000017142
naacl163001111651010103109215018001020060000301143
paclic40001090111026020412702001631000000072
ranlp310000144210070001640100232100000144
sem20000041394171005000781010712190000000119
speechc000000108004110023001000000001000100059
tacl10000000001000000000000000000000002
tal0000000000000000400000000000001000014
taln0000000000000000300000000000038500046
taslp0400001110001126005600200101000140002200239
tipster3003006000150000002000121000000002742
trec70492060117247304006000070110000216176284
tot364389413020130192492941583621378145211204168140154064241864931952573411589755220 

 Matrix for: reuse 

acl acmtslp alta anlp cath cl coling conll csal eacl emnlp hlt icassps ijcnlp inlg isca jep lre lrec ltc modulad mts muc naacl paclic ranlp sem speechc tacl tal taln taslp tipster trec tot
acl100010120040120100000000002000000015
acmtslp00000000000000000000000000000000000
alta00000000000000000000000000000000000
anlp00000000000000000000000000000000000
cath00000000000000000000000000000000000
cl00000010000100000020010110000000007
coling00000000000110010000000110100000006
conll00000000000000000000000000100000001
csal00000000000010010000000000000000002
eacl00000000000000000000000000000000000
emnlp00000101011000000000010000000000027
hlt10000101001001000000000010000000028
icassps000000000000000800000000000100010010
ijcnlp00000000000000000000000000000000000
inlg00000000000000000000000000000000000
isca000000001001400400110001000000000013
jep00000000000000000000000000000000000
lre00000000000000000000000000000000000
lrec00000000000010000000000000000000001
ltc00000000000000000000000000000000000
modulad00000000000000000000000000000000000
mts00000000000000000000000000000000000
muc10000010000000000000000000000000002
naacl10000000000000000000000020000000003
paclic00000000000000000000000000000000000
ranlp00000000000000000000000000000000000
sem00000010000000000000000000000000001
speechc00000000000001020000000000000000003
tacl00000000000000000000000000000000000
tal00000000000000000000000000000000000
taln00000000000000000000000000000000000
taslp0000000011006001000000000000100000019
tipster00000000000000000000000000000000000
trec00000000000010010000000000000000046
tot400012442263154028003102035042000108 

 Matrix for: plagiarism 

acl acmtslp alta anlp cath cl coling conll csal eacl emnlp hlt icassps ijcnlp inlg isca jep lre lrec ltc modulad mts muc naacl paclic ranlp sem speechc tacl tal taln taslp tipster trec tot
acl000001100002010100110001111000000012
acmtslp00000000000000000000000000000000000
alta00000000000000000000000000000000000
anlp10000000000000000000000000000000001
cath00000000000000000000000000000000000
cl00000000000001000020000010000000004
coling00001000000101010020000100100000008
conll00000000002000000000000000000000002
csal00000000000100020000000110000000005
eacl00000000000000000000000000000000000
emnlp00000000000200000000000200000000004
hlt10000000001110020001010000000000008
icassps0000000010122002200000002000100030034
ijcnlp00000000000000000000000011000000002
inlg00000000000000000000000000000000000
isca0000011000001410300000000000000030023
jep00000000000000000000000000000000000
lre00000000000000000000000000000000000
lrec00000000102000010000000001010000017
ltc00000000000000000000000000000000000
modulad00000000000000000000000000000000000
mts00000000000100010000000010000000003
muc00000000000000000000001000000000001
naacl00000000001110010000000100000000005
paclic00000000000001000001000000000000002
ranlp00000000000000000000000000000000000
sem00000000000000000000000000100000001
speechc00000000000000000000000000000001001
tacl00000000000000000000000000000000000
tal00000000000000000000000000000000000
taln00000000000000000000000000000000000
taslp000000000000400500000000000100000010
tipster00000000000000000000000000000000202
trec10000000001000000100000000000000036
tot3000122020811225039015301185333000724 

Without symmetric couples (i.e. after pruning)
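
One way to picture the pruning is to treat A/B and B/A as the same unordered couple and keep a single orientation. The report does not say which orientation the original tool keeps, so the sketch below simply keeps the first one encountered; the function name and the second sample couple (X00-0001 / Y00-0002) are hypothetical.

```python
def prune_symmetric(couples):
    """Keep one orientation of each couple; A/B and B/A count as the same couple.

    Assumption: the first orientation seen is the one kept.
    """
    seen = set()
    kept = []
    for first, second in couples:
        key = frozenset((first, second))  # order-independent identity
        if key in seen:
            continue
        seen.add(key)
        kept.append((first, second))
    return kept

couples = [
    ("P13-2046", "acmtslp2013-6"),
    ("acmtslp2013-6", "P13-2046"),   # symmetric counterpart of the line above
    ("X00-0001", "Y00-0002"),        # hypothetical couple with no counterpart
]

pruned = prune_symmetric(couples)
print(len(couples), "couples before pruning,", len(pruned), "after")
```

Couples detected in one direction only are unaffected, which is consistent with some matrices keeping the same totals before and after pruning.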

 Matrix for: all types of copy&paste 

acl acmtslp alta anlp cath cl coling conll csal eacl emnlp hlt icassps ijcnlp inlg isca jep lre lrec ltc modulad mts muc naacl paclic ranlp sem speechc tacl tal taln taslp tipster trec tot
acl2370281293712291333311919316018373010022541611002416529
acmtslp100000002000220500100000001000020016
alta302001301220000400400000010000000023
anlp600123301101000000300100100000002328
cath10012100000100000000000000000000017
cl900430302311010002800101303000000045
coling72103886119111511172771931601230409115951720001202425
conll2511112018833661902025101040270100200130
csal300003427011151034022000011012000100090
eacl14201529124313411050016100040140000501117
emnlp962212435127151016179391100518003011111201001905417
hlt8012052494111421426231527028041420615104500024612459
icassps13500033112546222610036400900101600053000616001502
ijcnlp26410032966115524440516301024210000500149
inlg800116420315001001400104000000010043
isca532002014450307822116152111387109891207036401211000613023977
jep000000000000000090000000000002000011
lre210001300001000202600001100000010021
lrec46301613715131016141710270049651206074564000503463
ltc300000000000010150135100000110400000071
modulad00000000000000000000000000000000000
mts130000292028939090220207074001000200111
muc3002024001060000000000111000000005136
naacl45100212428712112067211120310103042041000800243
paclic400010120111029030518802002170000010096
ranlp3200002342210700021940201040100000059
sem211000412113110110700013111010714520000001172
speechc00000110110041710500020000000010001300101
tacl11000000001000000020000000100000006
tal00000000000000005000000000000040009
taln0000000000000000300000000000052900064
taslp03000011131142070011800200102000170004900420
tipster3003006000150000002000131000000004038
trec1104112060221032830601100000100100000224295440
tot588841346444124271136171052193532114201181644171384586406326162633813030335413141442332 

 Matrix for: self-reuse 

acl acmtslp alta anlp cath cl coling conll csal eacl emnlp hlt icassps ijcnlp inlg isca jep lre lrec ltc modulad mts muc naacl paclic ranlp sem speechc tacl tal taln taslp tipster trec tot
acl1561046729813752196177160919008112631412001604362
acmtslp00000000000010020000000000000000003
alta100000200110000100100000010000000311
anlp200022201101000100100100000000000115
cath00002100000000000010000000000000004
cl300110101210000001100000002000000014
coling336114348105625911321109181031494930001004244
conll1311101311601127140101400004034010020091
csal2000032230107101500000000000400030043
eacl92001177210580125001110007014000020187
emnlp3722002925103395616070390000510901001403199
hlt379000221738218107143360271091400330001215234
icassps6300000031135157105460050010410018000351011134
ijcnlp720002182311821234031300201320000030193
inlg300102310121000101100001000000010019
isca1511010101007237204941045307543020610153000261021487
jep00000000000000000000000000000300003
lre00000110000000010030000010000001008
lrec281001634204594203001837504064161000203213
ltc3000000000000001011220000100200000022
modulad00000000000000000000000000000000000
mts2000012000301101011100200100000010027
muc00000100000100000000002000000000004
naacl3170201314221122513215027108010031000502151
paclic000000300000020101600100540000010024
ranlp0100001000110000011300101010000000021
sem5200033400520300064000010232000000173
speechc0000010030006002500100000000000070043
tacl01000000000000000020000000100000004
tal00000000000000001000000000000030004
taln0000000000000000000000000000015400019
taslp0100000040137100470000000100010002700156
tipster00000000000000000000001000000000001
trec300200001148000100400003000000008112147
tot255555915228193521513518511376891191221166245140426613322888741877199143 

 Matrix for: self-plagiarism 

acl acmtslp alta anlp cath cl coling conll csal eacl emnlp hlt icassps ijcnlp inlg isca jep lre lrec ltc modulad mts muc naacl paclic ranlp sem speechc tacl tal taln taslp tipster trec tot
acl71033642281812938166110011342060290080000814321
acmtslp100000002000120300100000001010020014
alta202001201130000300300000000000000018
anlp300101310001000000200000100000002318
cath10010100000100000000000000000000015
cl600320201110000001300000101000000022
coling4042732811812792551048053630611811110000201268
conll1200017724332071101210101016000000063
csal1000002040009001602200000001000090046
eacl5002413522122100300800003000000030056
emnlp6200121426161366175272602130030101180000601248
hlt4230522626834121112101402002140030100220001359276
icassps72000331943315102101690060000100003600031300768
ijcnlp1921001114411011121202530108002000020093
inlg500014210204011000300103000000000028
isca38901033502375179610179341102531006032400169000374002531
jep000000000000000090000000000001000010
lre210000200001000102300001000000000013
lrec24202594131299612823903428702034333100300274
ltc00000000000001014002380000110200000050
modulad00000000000000000000000000000000000
mts11000017202510280701920508310100010086
muc200201300106000000000081000000005130
naacl143001111551010934915016001020020000300115
paclic40001090111026020412702001630000000071
ranlp310000134210070001640100030100000038
sem180000498315904000781010612190000000106
speechc00000010800411002300100000000100070056
tacl10000000001000000000000000000000002
tal00000000000000004000000000000030007
taln0000000000000000300000000000037500045
taslp020000119001126005600200101000140002200236
tipster3003006000150000002000121000000002136
trec70492060116247304006000070110000216176283
tot3403094027192258734748111229313311251680316782914804021145331765229238877031197 

 Matrix for: reuse 

acl acmtslp alta anlp cath cl coling conll csal eacl emnlp hlt icassps ijcnlp inlg isca jep lre lrec ltc modulad mts muc naacl paclic ranlp sem speechc tacl tal taln taslp tipster trec tot
acl100010120040120100000000002000000015
acmtslp00000000000000000000000000000000000
alta00000000000000000000000000000000000
anlp00000000000000000000000000000000000
cath00000000000000000000000000000000000
cl00000010000100000020010110000000007
coling00000000000110010000000110100000006
conll00000000000000000000000000100000001
csal00000000000010010000000000000000002
eacl00000000000000000000000000000000000
emnlp00000101011000000000010000000000027
hlt10000101001001000000000010000000028
icassps000000000000000800000000000100010010
ijcnlp00000000000000000000000000000000000
inlg00000000000000000000000000000000000
isca000000001001400400110001000000000013
jep00000000000000000000000000000000000
lre00000000000000000000000000000000000
lrec00000000000010000000000000000000001
ltc00000000000000000000000000000000000
modulad00000000000000000000000000000000000
mts00000000000000000000000000000000000
muc10000010000000000000000000000000002
naacl10000000000000000000000020000000003
paclic00000000000000000000000000000000000
ranlp00000000000000000000000000000000000
sem00000010000000000000000000000000001
speechc00000000000001020000000000000000003
tacl00000000000000000000000000000000000
tal00000000000000000000000000000000000
taln00000000000000000000000000000000000
taslp0000000011006001000000000000100000019
tipster00000000000000000000000000000000000
trec00000000000010010000000000000000046
tot400012442263154028003102035042000108 

 Matrix for: plagiarism 

acl acmtslp alta anlp cath cl coling conll csal eacl emnlp hlt icassps ijcnlp inlg isca jep lre lrec ltc modulad mts muc naacl paclic ranlp sem speechc tacl tal taln taslp tipster trec tot
acl000001100001010100110001111000000011
acmtslp00000000000000000000000000000000000
alta00000000000000000000000000000000000
anlp10000000000000000000000000000000001
cath00000000000000000000000000000000000
cl00000000000001000020000010000000004
coling00001000000101010020000100100000008
conll00000000002000000000000000000000002
csal00000000000100020000000110000000005
eacl00000000000000000000000000000000000
emnlp00000000000100000000000100000000002
hlt10000000001100020001010000000000007
icassps0000000010122001800000002000100020029
ijcnlp00000000000000000000000011000000002
inlg00000000000000000000000000000000000
isca0000011000001410300000000000000030023
jep00000000000000000000000000000000000
lre00000000000000000000000000000000000
lrec00000000102000010000000001010000017
ltc00000000000000000000000000000000000
modulad00000000000000000000000000000000000
mts00000000000000010000000010000000002
muc00000000000000000000001000000000001
naacl00000000001100010000000100000000004
paclic00000000000001000001000000000000002
ranlp00000000000000000000000000000000000
sem00000000000000000000000000100000001
speechc00000000000000000000000000000000000
tacl00000000000000000000000000000000000
tal00000000000000000000000000000000000
taln00000000000000000000000000000000000
taslp000000000000400500000000000100000010
tipster00000000000000000000000000000000202
trec10000000001000000100000000000000036
tot300012202088205035015301175333000524 

Total elapsed time (reading and display included): 0.20696666666666666 minutes (about 12.4 seconds) with 8 cores

computed on: Fri Sep 09 19:58:43 CEST 2016