# 111015 tokyo scipy2_discussionquestionaire_i_python

Chief Research Officer at Preferred Networks America, Inc.
14. Oct 2011

### Q3: Sum with NaN and Inf

Given a sequence of values containing NaN and Inf, x = numpy.array([[1,0,nan,1,Inf,1,....]]), can you immediately think of a way to compute the sum of the elements of x other than NaN and Inf?

Unqualified names such as nan, Inf, isnan, and sum in the cells below suggest that the session had NumPy's namespace imported (for example, IPython's pylab mode); in that case sum is numpy.sum rather than the builtin.

    In [1]: # Setup
            import numpy
            import numpy as np
            x = numpy.array([[1,0,nan,1,Inf,1]])
            x
    Out[1]: array([[ 1., 0., nan, 1., inf, 1.]])

    In [2]: # Respondent 1 (note: this overwrites x in place)
            x[isnan(x)]=0
            x[isinf(x)]=0
            sum(x)
    Out[2]: 3.0

    In [3]: %timeit x[isnan(x)]=0
            %timeit x[isinf(x)]=0
            %timeit sum(x)
    100000 loops, best of 3: 4.47 us per loop
    100000 loops, best of 3: 4.31 us per loop
    100000 loops, best of 3: 3.46 us per loop

    In [4]: # Respondent 2
            # I am not sure when you would want to exclude Inf, but:
            sum_with_finite = x[numpy.isfinite(x)].sum()
            sum_with_finite
            # To exclude only NaN:
            sum_without_nan = numpy.nansum(x)
    Out[4]: 3.0

    In [5]: %timeit x[numpy.isfinite(x)].sum()
    100000 loops, best of 3: 7.47 us per loop

    In [6]: # Respondent 4
            sum(filter(lambda x: x != float('inf') and x == x, x[0]))
    Out[6]: 3.0

    In [7]: %timeit sum(filter(lambda x: x != float('inf') and x == x, x[0]))
    10000 loops, best of 3: 39 us per loop

    In [8]: # Respondent 5
            np.nansum(x[x != np.inf])
    Out[8]: 3.0
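
Most of the Q3 answers build a boolean mask of the usable entries and sum what remains. The following is a minimal sketch of that idea as a reusable function; the name finite_sum is hypothetical and does not appear in the original notebook. Unlike respondent 1's answer, it leaves the input array unmodified, because boolean indexing returns a copy.

```python
import numpy as np


def finite_sum(values):
    """Sum the entries of `values` that are neither NaN nor +/-Inf.

    `finite_sum` is a hypothetical helper name used for illustration only.
    """
    arr = np.asarray(values, dtype=float)
    # Boolean mask: True where the entry is a finite number.
    mask = np.isfinite(arr)
    # Boolean indexing returns a flattened copy, so `values` is not modified.
    return arr[mask].sum()


x = np.array([[1, 0, np.nan, 1, np.inf, 1]])
total = finite_sum(x)
print(total)
```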
    In [9]: %timeit np.nansum(x[x != np.inf])
    10000 loops, best of 3: 29.3 us per loop

    In [10]: # Respondent 6
             x[numpy.isfinite(x)].sum()
    Out[10]: 3.0

    In [11]: %timeit x[numpy.isfinite(x)].sum()
    100000 loops, best of 3: 8.33 us per loop

    In [12]: # Respondent 8
             numpy.sum(x[numpy.isfinite(x)])
    Out[12]: 3.0

    In [13]: %timeit numpy.sum(x[numpy.isfinite(x)])
    100000 loops, best of 3: 10.2 us per loop

    In [14]: # Respondent 9
             np.sum(x[np.isfinite(x)])
    Out[14]: 3.0

    In [15]: %timeit np.sum(x[np.isfinite(x)])
    100000 loops, best of 3: 10.1 us per loop

### Q4: Missing values in ndarray

Given a matrix containing nan, m = numpy.array([[1,nan,-1,0],[0,0,nan,1]]), can you immediately think of a way to delete the lines that contain nan and obtain a 2x2 matrix? (The question calls m a 4x2 matrix and speaks of deleting rows; in NumPy terms m has shape (2, 4), and every answer removes the columns that contain nan.)

    In [16]: # Setup
             import numpy
             import numpy as np
             m = numpy.array([[1,nan,-1,0],[0,0,nan,1]])
             m
    Out[16]: array([[ 1., nan, -1.,  0.],
                    [ 0.,  0., nan,  1.]])

    In [17]: # Respondent 1
             delete(m,argmax(isnan(m),axis=1),axis=1)
    Out[17]: array([[ 1.,  0.],
                    [ 0.,  1.]])
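
Respondent 1's answer uses argmax to find the position of the first NaN in each row, which works here because each row holds exactly one NaN. For a row with no NaN, argmax returns 0, so column 0 would be deleted by mistake. The following is a minimal sketch of a mask-based variant that does not rely on that assumption; the name drop_nan_columns is hypothetical and is not taken from the notebook.

```python
import numpy as np


def drop_nan_columns(matrix):
    """Return a copy of `matrix` without the columns that contain NaN.

    `drop_nan_columns` is a hypothetical helper name used for illustration.
    """
    arr = np.asarray(matrix, dtype=float)
    # True for each column in which no entry is NaN.
    keep = ~np.isnan(arr).any(axis=0)
    return arr[:, keep]


m = np.array([[1, np.nan, -1, 0],
              [0, 0, np.nan, 1]])
cleaned = drop_nan_columns(m)

# A case where one row has no NaN at all: only column 1 is removed.
m2 = np.array([[1.0, 2.0, 3.0],
               [4.0, np.nan, 6.0]])
cleaned2 = drop_nan_columns(m2)
print(cleaned)
print(cleaned2)
```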
    In [18]: %timeit delete(m,argmax(isnan(m),axis=1),axis=1)
    10000 loops, best of 3: 110 us per loop

    In [19]: # Respondent 2
             selected_m = m[:,numpy.isfinite(m.sum(axis=0))]
             selected_m
    Out[19]: array([[ 1.,  0.],
                    [ 0.,  1.]])

    In [20]: %timeit m[:,numpy.isfinite(m.sum(axis=0))]
    100000 loops, best of 3: 11.2 us per loop

    In [21]: # Respondent 4
             index = [xx for xx in range(len(m[0])) if sum(m[:,xx]) == sum(m[:,xx])]
             m[:,index]
    Out[21]: array([[ 1.,  0.],
                    [ 0.,  1.]])

    In [22]: %timeit index = [xx for xx in range(len(m[0])) if sum(m[:,xx]) == sum(m[:,xx])]
             %timeit m[:,index]
    10000 loops, best of 3: 38.9 us per loop
    100000 loops, best of 3: 12.8 us per loop

    In [23]: # Respondent 6
             nans = logical_or(isnan(m[0]), isnan(m[1]))
             mask = tile(logical_not(nans), (2,1))
             res = m[mask].reshape(2,2)
             res
    Out[23]: array([[ 1.,  0.],
                    [ 0.,  1.]])

    In [24]: %timeit nans = logical_or(isnan(m[0]), isnan(m[1]))
             %timeit mask = tile(logical_not(nans), (2,1))
             %timeit res = m[mask].reshape(2,2)
    100000 loops, best of 3: 5.46 us per loop
    100000 loops, best of 3: 13.2 us per loop
    100000 loops, best of 3: 4.46 us per loop

    In [25]: # Respondent 8
             m[:,numpy.apply_along_axis(numpy.all,0,numpy.isfinite(m))]
    Out[25]: array([[ 1.,  0.],
                    [ 0.,  1.]])

    In [26]: %timeit m[:,numpy.apply_along_axis(numpy.all,0,numpy.isfinite(m))]
    1000 loops, best of 3: 160 us per loop

    In [27]: # Respondent 9
             m[:, np.isfinite(np.sum(m, axis=0))]
    Out[27]: array([[ 1.,  0.],
                    [ 0.,  1.]])

    In [28]: %timeit m[:, np.isfinite(np.sum(m, axis=0))]
    100000 loops, best of 3: 12.8 us per loop

### Q5: 1-of-K representation

Can you immediately think of a way to convert numpy.array([[1,3,2]]) into its 1-of-K representation, numpy.array([[1,0,0],[0,0,1],[0,1,0]])?

    In [29]: # Setup
             import numpy
             import numpy as np
             y = numpy.array([[1,3,2]])
             y
    Out[29]: array([[1, 3, 2]])

    In [30]: # Respondent 1
             t = numpy.array([1,3,2])
             # pattern 1
             z = numpy.fromfunction(lambda i,j:j==t[i]-1,(t.size,t.max()),dtype=int)+0
             print(z)
             # pattern 2
             z = numpy.array([numpy.identity(t.max())[x-1,:] for x in t])
             print(z)
             # pattern 3 (numpy 1.6 or later)
             z = numpy.array([numpy.bincount([x-1],minlength=t.max()) for x in t])
             print(z)
    [[1 0 0]
     [0 0 1]
     [0 1 0]]
    [[ 1.  0.  0.]
     [ 0.  0.  1.]
     [ 0.  1.  0.]]
    [[1 0 0]
     [0 0 1]
     [0 1 0]]

    In [31]: %timeit z = numpy.fromfunction(lambda i,j:j==t[i]-1,(t.size,t.max()),dtype=int)+0
             %timeit z = numpy.array([numpy.identity(t.max())[x-1,:] for x in t])
             %timeit z = numpy.array([numpy.bincount([x-1],minlength=t.max()) for x in t])
    10000 loops, best of 3: 61.7 us per loop
    10000 loops, best of 3: 55.3 us per loop
    10000 loops, best of 3: 65.6 us per loop

    In [32]: # Respondent 2
             N=y.shape[1]
             yy=zeros(N**2)
             yy[N*arange(N)+y-1]=1 # Editor's note: yy[N*arange(N)+y-1].reshape(N,N) did not work
             yy.reshape(N,N)
    Out[32]: array([[ 1.,  0.,  0.],
                    [ 0.,  0.,  1.],
                    [ 0.,  1.,  0.]])

    In [33]: %timeit N=y.shape[1]
             %timeit yy=zeros(N**2)
             %timeit yy[N*arange(N)+y-1]=1 # Editor's note: yy[N*arange(N)+y-1].reshape(N,N) did not work
             %timeit yy.reshape(N,N)
    10000000 loops, best of 3: 203 ns per loop
    1000000 loops, best of 3: 1.37 us per loop
    10000 loops, best of 3: 14.6 us per loop
    1000000 loops, best of 3: 846 ns per loop

    In [34]: # Respondent 4
             K=3
             def my_func(i):
                 z = numpy.zeros(K,dtype=int)
                 z[i-1] = 1
                 return z
             numpy.array(map(my_func,y[0]))
    Out[34]: array([[1, 0, 0],
                    [0, 0, 1],
                    [0, 1, 0]])

    In [35]: %timeit numpy.array(map(my_func,y[0]))
    10000 loops, best of 3: 29.8 us per loop

    In [36]: # Respondent 6
             res = zeros((3, 3))
             indices = [i*3+c-1 for i, c in enumerate(y[0])]
             res.put(indices, 1)
             res
    Out[36]: array([[ 1.,  0.,  0.],
                    [ 0.,  0.,  1.],
                    [ 0.,  1.,  0.]])

    In [37]: %timeit res = zeros((3, 3))
             %timeit indices = [i*3+c-1 for i, c in enumerate(y[0])]
             %timeit res.put(indices, 1)
    1000000 loops, best of 3: 814 ns per loop
    100000 loops, best of 3: 10.2 us per loop
    100000 loops, best of 3: 10.3 us per loop

    In [38]: # Respondent 8
             # Editor's note: the following line is truncated in the source
             numpy.fromfunction(lambda i, j: numpy.array(y[0][i]==j+1, dtype=int), (3, 3), dtype
    Out[38]: array([[1, 0, 0],
                    [0, 0, 1],
                    [0, 1, 0]])
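
Several of the Q5 answers hard-code the number of classes as 3. The following is a minimal sketch of a general version that writes the ones with integer array indexing; the function name one_hot and its num_classes parameter are hypothetical and are not part of the notebook. It assumes 1-origin labels, as in the question. Separately, note that on Python 3 map returns an iterator, so respondent 4's numpy.array(map(my_func, y[0])) would need list(map(...)) to build the same 2-D array.

```python
import numpy as np


def one_hot(labels, num_classes=None):
    """Convert 1-origin integer labels to a 1-of-K (one-hot) matrix.

    `one_hot` and `num_classes` are hypothetical names used for illustration.
    """
    labels = np.asarray(labels).ravel()
    if num_classes is None:
        # Assumption: the largest label equals the number of classes.
        num_classes = int(labels.max())
    encoded = np.zeros((labels.size, num_classes), dtype=int)
    # Row i gets a 1 in column labels[i] - 1 (shift to 0-origin indexing).
    encoded[np.arange(labels.size), labels - 1] = 1
    return encoded


y = np.array([[1, 3, 2]])
z = one_hot(y)
print(z)
```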
    In [39]: # Editor's note: the following line is truncated in the source
             %timeit numpy.fromfunction(lambda i, j: numpy.array(y[0][i]==j+1, dtype=int), (3, 3
    10000 loops, best of 3: 44.3 us per loop

    In [40]: # Respondent 9
             # The reverse conversion is the harder problem...
             ans = np.zeros((3, 3))
             ans[np.arange(3, dtype=np.int), y-1] = 1 # Editor's note: y-1 is used to account for 0-origin indexing
             ans
    Out[40]: array([[ 1.,  0.,  0.],
                    [ 0.,  0.,  1.],
                    [ 0.,  1.,  0.]])

    In [41]: %timeit ans = np.zeros((3, 3))
             %timeit ans[np.arange(3, dtype=np.int), y-1] = 1
    1000000 loops, best of 3: 761 ns per loop
    100000 loops, best of 3: 7.97 us per loop

### Q6: Useful snippets

    In [45]: def _main():
                 pass

             if __name__ == '__main__':
                 _main()