Code Statistics for the Second Stage of the Alibaba Big Data Competition

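With the competition wrapped up, I wanted to count how many lines of code I had written. I found an article whose script sums the line counts of all files. My code is heavily reused across files, though, so I modified the script slightly and added a few lines. It now reports both the total number of lines and the number of lines left after deduplication.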

import sys, os

extens = [".py"]   # file extensions to count
linesCount = 0     # total lines across all matching files
filesCount = 0     # number of matching files
s = set()          # distinct lines, used for the deduplicated count

def funCount(dirName):
    global linesCount, filesCount
    for root, dirs, fileNames in os.walk(dirName):
        for f in fileNames:
            fname = os.path.join(root, f)
            # Skip files whose extension is not in the list.
            if os.path.splitext(f)[1] not in extens:
                continue
            filesCount += 1
            print fname
            try:
                # Read each file once and reuse the lines for both counts.
                with open(fname) as fh:
                    lines = fh.readlines()
            except IOError:
                # Unreadable file: skip it.
                continue
            print fname, " : ", len(lines)
            linesCount += len(lines)
            s.update(lines)


if len(sys.argv) > 1:
    # Each command-line argument is a directory to scan.
    for m_dir in sys.argv[1:]:
        print m_dir
        funCount(m_dir)
else:
    funCount(".")

print "files count : ", filesCount
print "lines count : ", linesCount
print "remove repeat lines: ", len(s)

raw_input("Press Enter to continue")
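
The script above targets Python 2.7. For readers on Python 3, the same idea (walk the tree, sum the line counts, and collect every line into a set) might be sketched as below. This is only an illustration: the function name `count_lines`, the `extensions` parameter, and the UTF-8 decoding with `errors="replace"` are assumptions, not part of the original script.

```python
import os


def count_lines(dir_name, extensions=(".py",)):
    """Return (file_count, total_lines, unique_lines) for files under dir_name.

    Hypothetical helper: the name and the return shape are illustrative.
    """
    file_count = 0
    total_lines = 0
    unique = set()
    for root, _dirs, file_names in os.walk(dir_name):
        for name in file_names:
            if os.path.splitext(name)[1] not in extensions:
                continue
            path = os.path.join(root, name)
            try:
                # Assumption: decode as UTF-8 and replace undecodable bytes
                # so that one oddly encoded file does not abort the scan.
                with open(path, encoding="utf-8", errors="replace") as fh:
                    lines = fh.readlines()
            except OSError:
                continue  # unreadable file: skip it
            file_count += 1
            total_lines += len(lines)
            unique.update(lines)  # identical lines collapse into one entry
    return file_count, total_lines, len(unique)
```

Calling `count_lines(".")` returns the three numbers for the current directory. As in the original, lines are compared exactly, including indentation and the trailing newline, so the same statement at two indentation levels counts as two distinct lines.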

Final result: 256 files, 20384 lines in total, 2263 lines after deduplication.

While I was at it, I also counted the total code from the first stage:

files count :  144
lines count :  22411
remove repeat lines:  3187
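
To put the two sets of totals side by side, the share of distinct lines can be computed directly from the reported numbers. The helper name `unique_share` is hypothetical; the figures are the ones quoted in this post.

```python
def unique_share(total_lines, unique_lines):
    """Fraction of all counted lines that are distinct."""
    return unique_lines / total_lines


# Totals reported in this post for each stage.
stage_two = unique_share(20384, 2263)
stage_one = unique_share(22411, 3187)

print(f"stage 2: {stage_two:.1%} of lines are distinct")  # 11.1%
print(f"stage 1: {stage_one:.1%} of lines are distinct")  # 14.2%
```

In both stages, roughly one line in eight or nine is distinct, which is consistent with the heavy code reuse mentioned at the start.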

Since the files inside the virtual machine cannot be copied to my local machine, I could only take a screenshot as a keepsake.






Files from the first stage:

Python 2.7 (r27:82525, Jul  4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>> 
.\calLines.py
.\calLines.py  :  42
.\3AddData\mergeFile.py
.\3AddData\mergeFile.py  :  121
.\addTimeFactor\test.py
.\addTimeFactor\test.py  :  23
.\addTimeFactor\textPreprocess.py
.\addTimeFactor\textPreprocess.py  :  417
.\addTimeFactor\timeF_split\timeF.py
.\addTimeFactor\timeF_split\timeF.py  :  455
.\availableResult\lastResult\ali323.py
.\availableResult\lastResult\ali323.py  :  137
.\availableResult\lastResult\delNotPurchasedBrandInDemo2.py
.\availableResult\lastResult\delNotPurchasedBrandInDemo2.py  :  342
.\availableResult\lastResult\mergeFile.py
.\availableResult\lastResult\mergeFile.py  :  163
.\brainstorming\black2tab\b2tab.py
.\brainstorming\black2tab\b2tab.py  :  11
.\brainstorming\calParameter\calParameter.py
.\brainstorming\calParameter\calParameter.py  :  56
.\brainstorming\delUserBrandOfNoPurchase\delNotPurchasedBrandInDemo2.py
.\brainstorming\delUserBrandOfNoPurchase\delNotPurchasedBrandInDemo2.py  :  215
.\brainstorming\findUserBrandNumInMap\findUserBrandNumInMap.py
.\brainstorming\findUserBrandNumInMap\findUserBrandNumInMap.py  :  56
.\brainstorming\loveBrand\userLoveBrand.py
.\brainstorming\loveBrand\userLoveBrand.py  :  250
.\brainstorming\mergeFile\mergeFile.py
.\brainstorming\mergeFile\mergeFile.py  :  217
.\brainstorming\mergeFile\test\mergeFile.py
.\brainstorming\mergeFile\test\mergeFile.py  :  183
.\brainstorming\selBrandAboutTwiceActionInDiffTime\selBrandWhichMultiActionInDiffTime.py
.\brainstorming\selBrandAboutTwiceActionInDiffTime\selBrandWhichMultiActionInDiffTime.py  :  85
.\brainstorming\user-brands_dict\users2brands_dict.py
.\brainstorming\user-brands_dict\users2brands_dict.py  :  20
.\Ch05\logRegres.py
.\Ch05\logRegres.py  :  130
.\Ch05\EXTRAS\plot2D.py
.\Ch05\EXTRAS\plot2D.py  :  44
.\Ch05\EXTRAS\plotGD.py
.\Ch05\EXTRAS\plotGD.py  :  52
.\Ch05\EXTRAS\plotSDerror.py
.\Ch05\EXTRAS\plotSDerror.py  :  69
.\Ch05\EXTRAS\sigmoidPlot.py
.\Ch05\EXTRAS\sigmoidPlot.py  :  21
.\corporatMerge\getImplement\mergeFile.py
.\corporatMerge\getImplement\mergeFile.py  :  167
.\finalProgram\ruleMethod\add_del_manual.py
.\finalProgram\ruleMethod\add_del_manual.py  :  51
.\finalProgram\ruleMethod\getRawDataByRule.py
.\finalProgram\ruleMethod\getRawDataByRule.py  :  137
.\finalProgram\ruleMethod\mainProcess_AddTimeFactor.py
.\finalProgram\ruleMethod\mainProcess_AddTimeFactor.py  :  454
.\finalProgram\ruleMethod\mergeFile.py
.\finalProgram\ruleMethod\mergeFile.py  :  101
.\finalProgram\ruleMethod\otherFunctions.py
.\finalProgram\ruleMethod\otherFunctions.py  :  342
.\finalProgram\ruleMethod\predict.py
.\finalProgram\ruleMethod\predict.py  :  13
.\finalProgram\ruleMethod\preprocessingData.py
.\finalProgram\ruleMethod\preprocessingData.py  :  52
.\finalProgram\svd\continueNum2RawNum.py
.\finalProgram\svd\continueNum2RawNum.py  :  82
.\finalProgram\svd\creatPredictList.py
.\finalProgram\svd\creatPredictList.py  :  145
.\finalProgram\svd\PredictNewBrand.py
.\finalProgram\svd\PredictNewBrand.py  :  131
.\finalProgram\svd\SVD.py
.\finalProgram\svd\SVD.py  :  204
.\forManualSelect\jyc_3b_50\selectFromUser4Brand.py
.\forManualSelect\jyc_3b_50\selectFromUser4Brand.py  :  49
.\forManualSelect\jyc_4b_50_2\selectFromUser4Brand.py
.\forManualSelect\jyc_4b_50_2\selectFromUser4Brand.py  :  49
.\forManualSelect\jyc_4b_50_2\selectFromUser4Brand_Fix.py
.\forManualSelect\jyc_4b_50_2\selectFromUser4Brand_Fix.py  :  43
.\forManualSelect\jyc_8-12b_50\selectFromUser4Brand.py
.\forManualSelect\jyc_8-12b_50\selectFromUser4Brand.py  :  49
.\forManualSelect\jyc_8-12b_50\selectFromUser4Brand_Fix.py
.\forManualSelect\jyc_8-12b_50\selectFromUser4Brand_Fix.py  :  43
.\forManualSelect\long_5b_50\selectFromUser4Brand.py
.\forManualSelect\long_5b_50\selectFromUser4Brand.py  :  29
.\forManualSelect\tianyiAPI\ali323.py
.\forManualSelect\tianyiAPI\ali323.py  :  137
.\forManualSelect\tianyiAPI\calBrand.py
.\forManualSelect\tianyiAPI\calBrand.py  :  23
.\forManualSelect\tianyiAPI\delNotPurchasedBrandInDemo.py
.\forManualSelect\tianyiAPI\delNotPurchasedBrandInDemo.py  :  152
.\forManualSelect\tianyiAPI\delNotPurchasedBrandInDemo2.py
.\forManualSelect\tianyiAPI\delNotPurchasedBrandInDemo2.py  :  342
.\forManualSelect\tianyiAPI\delNotPurchasedBrandInDemo329.py
.\forManualSelect\tianyiAPI\delNotPurchasedBrandInDemo329.py  :  375
.\forManualSelect\tianyiAPI\delNotPurchasedBrandInDemoTXT.py
.\forManualSelect\tianyiAPI\delNotPurchasedBrandInDemoTXT.py  :  152
.\forManualSelect\tianyiAPI\test.py
.\forManualSelect\tianyiAPI\test.py  :  4
.\forManualSelect\tl_7b_50\selectFromUser4Brand.py
.\forManualSelect\tl_7b_50\selectFromUser4Brand.py  :  29
.\forManualSelect\zyh_6b_50\selectFromUser4Brand.py
.\forManualSelect\zyh_6b_50\selectFromUser4Brand.py  :  24
.\hitNone\mergeFile.py
.\hitNone\mergeFile.py  :  191
.\Long\mergeFile.py
.\Long\mergeFile.py  :  94
.\Long\hit\mergeFile.py
.\Long\hit\mergeFile.py  :  65
.\LR\oliMethod\test.py
.\LR\oliMethod\test.py  :  18
.\newScoreMethod\ali323.py
.\newScoreMethod\ali323.py  :  137
.\newScoreMethod\newScoreMethod.py
.\newScoreMethod\newScoreMethod.py  :  213
.\operationOfEveryday\4-10_1_splitBestToPicees\splitToPicees.py
.\operationOfEveryday\4-10_1_splitBestToPicees\splitToPicees.py  :  287
.\operationOfEveryday\4-10_1_splitBestToPicees\test.py
.\operationOfEveryday\4-10_1_splitBestToPicees\test.py  :  152
.\operationOfEveryday\4-10_2_PredictNewBrand\PredictNewBrand.py
.\operationOfEveryday\4-10_2_PredictNewBrand\PredictNewBrand.py  :  126
.\operationOfEveryday\4-10_2_PredictNewBrand\testTheResult\mergeFile.py
.\operationOfEveryday\4-10_2_PredictNewBrand\testTheResult\mergeFile.py  :  211
.\operationOfEveryday\4-11_MergeLongResult-6.3\mergeFile.py
.\operationOfEveryday\4-11_MergeLongResult-6.3\mergeFile.py  :  180
.\operationOfEveryday\4-11_MergeLongResult-6.3\selBrandWhichMultiActionInDiffTime.py
.\operationOfEveryday\4-11_MergeLongResult-6.3\selBrandWhichMultiActionInDiffTime.py  :  87
.\operationOfEveryday\4-11_selBrandAboutTwiceActionInDiffTime\selBrandWhichMultiActionInDiffTime.py
.\operationOfEveryday\4-11_selBrandAboutTwiceActionInDiffTime\selBrandWhichMultiActionInDiffTime.py  :  85
.\operationOfEveryday\4-12\mergeFile_412.py
.\operationOfEveryday\4-12\mergeFile_412.py  :  174
.\operationOfEveryday\4-12_merge_selcMultiBrand\mergeFile_412.py
.\operationOfEveryday\4-12_merge_selcMultiBrand\mergeFile_412.py  :  171
.\operationOfEveryday\4-12_merge_selcMultiBrand\selBrandWhichMultiActionInDiffTime_412.py
.\operationOfEveryday\4-12_merge_selcMultiBrand\selBrandWhichMultiActionInDiffTime_412.py  :  80
.\operationOfEveryday\4-13_delTheLastOnes\textPreprocess.py
.\operationOfEveryday\4-13_delTheLastOnes\textPreprocess.py  :  47
.\operationOfEveryday\4-13_selectFromUser4Brand\selectFromUser4Brand.py
.\operationOfEveryday\4-13_selectFromUser4Brand\selectFromUser4Brand.py  :  8
.\operationOfEveryday\4-15_merge_longBro\mergeFile.py
.\operationOfEveryday\4-15_merge_longBro\mergeFile.py  :  167
.\operationOfEveryday\4-16_merge_6.89\mergeFile.py
.\operationOfEveryday\4-16_merge_6.89\mergeFile.py  :  178
.\operationOfEveryday\4-16_merge_6.89_proProcess\mergeFile.py
.\operationOfEveryday\4-16_merge_6.89_proProcess\mergeFile.py  :  175
.\operationOfEveryday\4-16_merge_forSelect\mergeFile.py
.\operationOfEveryday\4-16_merge_forSelect\mergeFile.py  :  175
.\operationOfEveryday\4-17_divideWork\toAdd\mergeFile.py
.\operationOfEveryday\4-17_divideWork\toAdd\mergeFile.py  :  158
.\operationOfEveryday\4-17_divideWork\toMinus\mergeFile.py
.\operationOfEveryday\4-17_divideWork\toMinus\mergeFile.py  :  178
.\operationOfEveryday\4-17_divideWork\toPiece\splitToPicees.py
.\operationOfEveryday\4-17_divideWork\toPiece\splitToPicees.py  :  288
.\operationOfEveryday\4-18_giveLongToMinus\mergeFile.py
.\operationOfEveryday\4-18_giveLongToMinus\mergeFile.py  :  115
.\operationOfEveryday\4-18_submintFile2disoder\mergeFile.py
.\operationOfEveryday\4-18_submintFile2disoder\mergeFile.py  :  160
.\operationOfEveryday\4-4\textPreprocess.py
.\operationOfEveryday\4-4\textPreprocess.py  :  210
.\operationOfEveryday\4-4\methodOfCSDN\textPreprocess.py
.\operationOfEveryday\4-4\methodOfCSDN\textPreprocess.py  :  416
.\operationOfEveryday\4-5\methodOfCSDN__3500\textPreprocess.py
.\operationOfEveryday\4-5\methodOfCSDN__3500\textPreprocess.py  :  403
.\operationOfEveryday\4-6\anlysisTheBestResult\mergeFile-4_6.py
.\operationOfEveryday\4-6\anlysisTheBestResult\mergeFile-4_6.py  :  160
.\operationOfEveryday\4-6\methodOfCSDN\textPreprocess.py
.\operationOfEveryday\4-6\methodOfCSDN\textPreprocess.py  :  403
.\operationOfEveryday\4-8\delHalfBrand.py
.\operationOfEveryday\4-8\delHalfBrand.py  :  57
.\operationOfEveryday\4-8\mergeFile.py
.\operationOfEveryday\4-8\mergeFile.py  :  163
.\operationOfEveryday\4-9_splitBestToPicees\splitToPicees.py
.\operationOfEveryday\4-9_splitBestToPicees\splitToPicees.py  :  225
.\operationOfEveryday\4-9_splitBestToPicees\splitToPiceesSthWrong.py
.\operationOfEveryday\4-9_splitBestToPicees\splitToPiceesSthWrong.py  :  152
.\operationOfEveryday\4-9_splitBestToPicees\test.py
.\operationOfEveryday\4-9_splitBestToPicees\test.py  :  152
.\otherMethod\trainTest.py
.\otherMethod\trainTest.py  :  150
.\resultOfLong\tianyiAPI\ali323.py
.\resultOfLong\tianyiAPI\ali323.py  :  137
.\resultOfLong\tianyiAPI\calBrand.py
.\resultOfLong\tianyiAPI\calBrand.py  :  23
.\resultOfLong\tianyiAPI\delNotPurchasedBrandInDemo.py
.\resultOfLong\tianyiAPI\delNotPurchasedBrandInDemo.py  :  152
.\resultOfLong\tianyiAPI\delNotPurchasedBrandInDemo2.py
.\resultOfLong\tianyiAPI\delNotPurchasedBrandInDemo2.py  :  342
.\resultOfLong\tianyiAPI\delNotPurchasedBrandInDemo329.py
.\resultOfLong\tianyiAPI\delNotPurchasedBrandInDemo329.py  :  375
.\resultOfLong\tianyiAPI\delNotPurchasedBrandInDemoTXT.py
.\resultOfLong\tianyiAPI\delNotPurchasedBrandInDemoTXT.py  :  152
.\resultOfLong\tianyiAPI\test.py
.\resultOfLong\tianyiAPI\test.py  :  4
.\resultOfLong\timeF_split\timeF.py
.\resultOfLong\timeF_split\timeF.py  :  448
.\resultProcess\jyc\mergeFile.py
.\resultProcess\jyc\mergeFile.py  :  147
.\resultProcess\jyc\forR_24\mergeFile.py
.\resultProcess\jyc\forR_24\mergeFile.py  :  117
.\resultProcess\jyc\getResult_4-19\getResult_4-19.py
.\resultProcess\jyc\getResult_4-19\getResult_4-19.py  :  164
.\resultProcess\long\mergeFile.py
.\resultProcess\long\mergeFile.py  :  205
.\resultProcess\testResult\mergeFile.py
.\resultProcess\testResult\mergeFile.py  :  126
.\resultProcess\wangning\mergeFile.py
.\resultProcess\wangning\mergeFile.py  :  157
.\selectTask\add_518_2.py
.\selectTask\add_518_2.py  :  314
.\selectTask\ali323.py
.\selectTask\ali323.py  :  137
.\selectTask\selectAuto.py
.\selectTask\selectAuto.py  :  73
.\selectTask\selectFromUser4Brand.py
.\selectTask\selectFromUser4Brand.py  :  49
.\selectTask\sortedScore.py
.\selectTask\sortedScore.py  :  0
.\selectTask\analysis\mergeFileAnalysis.py
.\selectTask\analysis\mergeFileAnalysis.py  :  210
.\selectTask\selectHighScore\mergeFile.py
.\selectTask\selectHighScore\mergeFile.py  :  159
.\selectTask\tianyiAPI\ali323.py
.\selectTask\tianyiAPI\ali323.py  :  137
.\selectTask\tianyiAPI\calBrand.py
.\selectTask\tianyiAPI\calBrand.py  :  23
.\selectTask\tianyiAPI\delNotPurchasedBrandInDemo.py
.\selectTask\tianyiAPI\delNotPurchasedBrandInDemo.py  :  152
.\selectTask\tianyiAPI\delNotPurchasedBrandInDemo2.py
.\selectTask\tianyiAPI\delNotPurchasedBrandInDemo2.py  :  342
.\selectTask\tianyiAPI\delNotPurchasedBrandInDemo329.py
.\selectTask\tianyiAPI\delNotPurchasedBrandInDemo329.py  :  375
.\selectTask\tianyiAPI\delNotPurchasedBrandInDemoTXT.py
.\selectTask\tianyiAPI\delNotPurchasedBrandInDemoTXT.py  :  152
.\selectTask\tianyiAPI\test.py
.\selectTask\tianyiAPI\test.py  :  4
.\submitInAli\calc.py
.\submitInAli\calc.py  :  74
.\submitInAli\numOfSubmit.py
.\submitInAli\numOfSubmit.py  :  117
.\SVD\SVD.py
.\SVD\SVD.py  :  204
.\SVD\continueNum2RawNum\continueNum2RawNum.py
.\SVD\continueNum2RawNum\continueNum2RawNum.py  :  82
.\SVD\creatPredictList\creatPredictList.py
.\SVD\creatPredictList\creatPredictList.py  :  145
.\SVD\creatRawData\creatRawData.py
.\SVD\creatRawData\creatRawData.py  :  258
.\SVD\predictNewBrand\PredictNewBrand.py
.\SVD\predictNewBrand\PredictNewBrand.py  :  131
.\test\mergeFileTest.py
.\test\mergeFileTest.py  :  166
.\test\spider.py
.\test\spider.py  :  39
.\textPreprocess\test.py
.\textPreprocess\test.py  :  55
.\textPreprocess\textPreprocess.py
.\textPreprocess\textPreprocess.py  :  371
.\textPreprocess\textPreprocess_DelLowScore.py
.\textPreprocess\textPreprocess_DelLowScore.py  :  303
.\tianyiAPI\ali323.py
.\tianyiAPI\ali323.py  :  137
.\tianyiAPI\calBrand.py
.\tianyiAPI\calBrand.py  :  23
.\tianyiAPI\delNotPurchasedBrandInDemo.py
.\tianyiAPI\delNotPurchasedBrandInDemo.py  :  152
.\tianyiAPI\delNotPurchasedBrandInDemo2.py
.\tianyiAPI\delNotPurchasedBrandInDemo2.py  :  342
.\tianyiAPI\delNotPurchasedBrandInDemo329.py
.\tianyiAPI\delNotPurchasedBrandInDemo329.py  :  375
.\tianyiAPI\delNotPurchasedBrandInDemoTXT.py
.\tianyiAPI\delNotPurchasedBrandInDemoTXT.py  :  152
.\tianyiAPI\test.py
.\tianyiAPI\test.py  :  4
.\timeF_split\timeF.py
.\timeF_split\timeF.py  :  432
.\timeF_split\tianyiAPI\ali323.py
.\timeF_split\tianyiAPI\ali323.py  :  137
.\timeF_split\tianyiAPI\calBrand.py
.\timeF_split\tianyiAPI\calBrand.py  :  23
.\timeF_split\tianyiAPI\delNotPurchasedBrandInDemo.py
.\timeF_split\tianyiAPI\delNotPurchasedBrandInDemo.py  :  152
.\timeF_split\tianyiAPI\delNotPurchasedBrandInDemo2.py
.\timeF_split\tianyiAPI\delNotPurchasedBrandInDemo2.py  :  342
.\timeF_split\tianyiAPI\delNotPurchasedBrandInDemo329.py
.\timeF_split\tianyiAPI\delNotPurchasedBrandInDemo329.py  :  375
.\timeF_split\tianyiAPI\delNotPurchasedBrandInDemoTXT.py
.\timeF_split\tianyiAPI\delNotPurchasedBrandInDemoTXT.py  :  152
.\timeF_split\tianyiAPI\test.py
.\timeF_split\tianyiAPI\test.py  :  4
.\viewFig\figBrandOverTime.py
.\viewFig\figBrandOverTime.py  :  24
.\Wangning\mergeFile.py
.\Wangning\mergeFile.py  :  65
files count :  144
lines count :  22411
remove repeat lines:  3187
Press Enter to continue
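
Many entries in the log above share a file name and a line count (for example, `ali323.py` with 137 lines appears under several directories), which hints at where the duplicated lines come from. A rough sketch for grouping such entries from the printed log follows. The names `LOG_LINE` and `group_by_name` are hypothetical, and a matching name plus line count only suggests that two files are copies; it does not prove it.

```python
import ntpath  # handles the Windows-style paths in the log on any platform
import re
from collections import defaultdict

# Matches lines such as ".\selectTask\ali323.py  :  137"; summary lines
# like "files count :  144" are skipped because they do not end in ".py".
LOG_LINE = re.compile(r"^(?P<path>.+\.py)\s+:\s+(?P<count>\d+)$")


def group_by_name(log_text):
    """Map (file name, line count) to the list of paths sharing both."""
    groups = defaultdict(list)
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            key = (ntpath.basename(match.group("path")), int(match.group("count")))
            groups[key].append(match.group("path"))
    return groups


# A few lines copied from the log above.
sample = r"""
.\newScoreMethod\ali323.py
.\newScoreMethod\ali323.py  :  137
.\selectTask\ali323.py  :  137
.\selectTask\tianyiAPI\ali323.py  :  137
.\submitInAli\calc.py  :  74
files count :  144
"""

for (name, count), paths in group_by_name(sample).items():
    if len(paths) > 1:
        print(name, count, "lines, found in", len(paths), "places")
```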