-
Notifications
You must be signed in to change notification settings - Fork 0
/
command.txt
496 lines (390 loc) · 32.1 KB
/
command.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
# 现病史分块_3.1.0: 有些直接从k8s中找到的镜像,直接启动报错 'NoneType' object has no attribute 'model_checkpoint_path' 原因是模型文件没有放到正确位置
# -v ${MODEL_DIR}/ai-doctor-assistant/models:/ai-doctor-assistant/models /ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0下没有模型文件
docker rm -f pre_his_chunk_3.1.0
# 没有挂载,可以直接用 说明这个镜像是你后来自己封装的 把需要的模型文件加进去了
#docker run -p 11000:8080 -p 11001:8081 -itd --runtime=nvidia --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:v1 python app.py # 可以直接用
#docker run -itd -p 11000:8080 -p 11001:8081 --runtime=nvidia -v ${MODEL_DIR}/ai-doctor-assistant/models:/ai-doctor-assistant/models --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 python app.py
#docker run -it -p 11000:8080 -p 11001:8081 -v ${MODEL_DIR}/ai-doctor-assistant/models:/ai-doctor-assistant/models --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 /bin/bash
# 可以运行
#docker run -it -p 11000:8080 -p 11001:8081 -v ${MODEL_DIR}/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0:/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0 --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 /bin/bash
docker run -itd -p 11000:8080 -p 11001:8081 --runtime=nvidia -v ${MODEL_DIR}/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0:/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0 --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 python app.py
#docker inspect pre_his_chunk_3.1.0
#docker run -itd -p 11040:8080 -p 11041:8081 -v ${MODEL_DIR}/ai-doctor-assistant/models:/ai-doctor-assistant/models --name per_his_v3.2.5 dockerdist.bdmd.com/per-his:v3.2.5 python app.py
#/home/machaoyang/Projects/model/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0
#docker run -p 11000:8080 -itd --runtime=nvidia --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:v1 /bin/bash
#docker run -itd -p 11000:8080 -p 11001:8081 --runtime=nvidia -v ${MODEL_DIR}/ai-doctor-assistant/models:/ai-doctor-assistant/models --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:v1 python app.py
#docker run -itd -p 11000:8080 -p 11001:8081 --gpus=all -v ${MODEL_DIR}/ai-doctor-assistant/models:/ai-doctor-assistant/models --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 python app.py
#docker run -itd -p 11000:8080 -p 11001:8081 --runtime=nvidia -v ${MODEL_DIR}/ai-doctor-assistant/models:/ai-doctor-assistant/models --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 python app.py
#docker run -p 11000:8080 -p 11001:8081 -it --runtime=nvidia --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:v1 /bin/bash
#/home/machaoyang/Projects/model/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0
#/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0
# ######################################
# docker run时最后不加任何指令,则默认使用构建镜像时指定的指令(例如CMD python3 app.py) 当我们使用docker run -v /host/something:/container/something尝试挂载默写目录到容器内时,期待的效果是容器内的文件和本地目录的文件相互映射,但实际效果宿主机目录会直接覆盖掉容器内的目录
docker run -itd -p 12570:8080 -p 12571:8081 -p 1122:22 --name convert_to_triple -v /home/machaoyang/Projects/demonstrate:/demonstrate dockerdist.bdmd.com/convert_to_triple:v1.0526
# docker_CPM-2
docker run -p 303:22 -it --name prompt_learning_v3 gyxthu17/cpm-2:1.1 /bin/bash
docker run -p 15001:8080 -p 301:22 -it --name prompt_learning_v1 gyxthu17/cpm-2:1.1 /bin/bash
docker run -p 410:22 -it --gpus=all --name prompt_learning_v5 gyxthu17/cpm-2:1.1 /bin/bash
docker run -p 420:22 -it --gpus=all –-shm-size 8g --name prompt_learning_v5 gyxthu17/cpm-2:1.1 /bin/bash
docker run -p 415:22 -it --gpus=all --name prompt_learning_v5 -v /data/machaoyang/cpm-2-finetune:/cpm-2-finetune gyxthu17/cpm-2:1.1 /bin/bash
docker run -p 417:22 -it --gpus=all --name prompt_learning_v7 -v /data/machaoyang/cpm-2-finetune:/cpm-2-finetune gyxthu17/cpm-2:1.1_machaoyang /bin/bash
docker run -p 418:22 -it --gpus=all --name prompt_learning_v8 -v /data/machaoyang/cpm-2-finetune_encoder:/cpm-2-finetune gyxthu17/cpm-2:1.1_machaoyang /bin/bash
docker run -p 418:22 -it --gpus=all --name prompt_learning_v8 -v /data/machaoyang/cpm-2-finetune:/cpm-2-finetune gyxthu17/cpm-2:1.1_machaoyang /bin/bash
docker run -it --gpus=all --name prompt_learning_v4 -v /data/machaoyang/cpm-2-finetune:/cpm-2-finetune gyxthu17/cpm-2:1.1_machaoyang /bin/bash
# GPU启动 docker 19.03之前的是--runtime=nvidia 19.03和之后为--gpus all
docker run -p 420:22 -it --runtime=nvidia --name prompt_learning_v11 gyxthu17/cpm-2:1.2_v1 /bin/bash
docker run -p 420:22 -it --runtime=nvidia --name prompt_learning_v80 cmp2_v2 /bin/bash
docker run -p 411:22 -it --runtime=nvidia --name prompt_learning_v11 gyxthu17/cpm-2:1.2_v1 /bin/bash
# linux查看系统信息
source ~/.bashrc
nvcc -V
conda install cudatoolkit=10.1
conda update --force conda
cat /usr/local/version.txt
linux查看版本当前操作系统发行信息 cat /etc/issue
Linux查看cpu相关信息,包括型号、主频、内核信息等 cat /etc/cpuinfo
# 即使model下没有ai-doctor-assistant这个目录,也会创建,并且直接将源目录下的所有文件直接复制到目标目录,而不会在ai-doctor-assistant下再创建一个ai-doctor-assistant
cp -r ai-doctor-assistant/ ../Projects/model/ai-doctor-assistant/
# 查看某个目录总共占的磁盘容量
du -sh 目录
# 查看/目录下的所有文件和文件夹大小,找出所有GB大小的文件,并从大到小排序
du -h / | grep G | sort -nr
# 查看磁盘空间
df -hl
# 查看端口被哪个进程占用
netstat -tunlp | grep 2181
根据端口占用的进程:
(base) htht@htht:~/mcy_frp/frp_0.32.1_linux_amd64$ lsof -i:7000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
frpc 5904 htht 6u IPv4 75992 0t0 TCP bogon:60230->8.130.126.131:bbs (ESTABLISHED)
(base) htht@htht:~/mcy_frp/frp_0.32.1_linux_amd64$
根据PID/COMMAND查看启动的命令文件
ps -ef|grep PID/COMMAND
root@iZ0jl16snshllvywzyv044Z:~# ps -ef|grep 76140
root 76140 1 0 Jun05 ? 00:13:44 ./frps -c ./frps.ini
进程的启动时间和运行时间:
ps -o lstart,etime -p 76140
# nohup
nohup python train.py >> train_log.log 2>&1 & 追加到文件
nohup python train.py > train_log.log 2>&1 & 清空之前文件内容新加
nohup python spert.py train --config configs/example_train.conf > train_log.log 2>&1 &
CUDA_VISIBLE_DEVICES=1 python spert.py train --config configs/example_train.conf > train_log.log 2>&1 &
nohup ./启动.sh > log/不良反应_token_cls_index_no_blank_1.log 2>&1 &
nohup ./start_frp_client.sh > log1.log 2>&1 &
# tensorboard
tensorboard –-logdir /Users/machaoyang/Downloads/cpm-2-finetune/result/
http://localhost:6006
# 本地文件 容器 互传文件 这里传到根目录了
docker cp ../cpm-2-finetune prompt_learning_v8:/
docker cp prompt_learning_v8:/cpm-2-finetune ./
# 启动容器 容器id或者容器名字
docker start d0457afeeb1c
# 进入容器 容器id或者容器名字 指定gpu是在创建容器的时候,进入的话不需要指定
docker exec -it d0457afeeb1c
# 获取容器/镜像的元数据
docker inspect 容器/镜像
# 打包镜像
docker commit cure_his_struct_1.0.0 dockerdist.bdmd.com/symptom_struct_cuda_gemotric:cure_struct_1.0.0
# 上传镜像
docker push dockerdist.bdmd.com/symptom_struct_cuda_gemotric:cure_struct_1.0.0
# 更新apt源,安装某个包:
apt-get update
缺少sudo:
apt install net-tools
缺少vim:
apt install vim
缺少ssh:
apt install openssh-server
apt install openssh-client
# 重启ssh服务
/etc/init.d/ssh restart
# 关于MQ的docker
docker run -it -p 6653:6650 -p 8083:8080 --name pulsar_v3 apachepulsar/pulsar:latest
进入容器后,bin/pulsar standalone
docker exec -it 394fa08b3130 /bin/bash
docker exec -it 394fa08b3130 bin/pulsar standalone
docker run -it -p 6654:6650 -p 8084:8080 --name pulsar_v4 apachepulsar/pulsar:latest bin/pulsar standalone
# 容器启动:药品不良反应实体抽取
docker run -it -p 411:22 -p 10001:10001 -p 10000:10000 --name drug_introduction_v1 python:3.7.3 /bin/bash
# v1
# 将测试模型界面的容器(从南京k8s找容器和模型)部署到office2上,以提供外部可以访问的接口的步骤:
1.打开数据库:other_model_test_base_config表格,找到模型对应的title以及version和url(区分是model还是model-nj)
2.在mac上kubectl get pods -n model-nj,通过1的url找到对应的服务,例如hpi-v3-2-0-d55b49998-bv4nn,只取hpi-v3-2-0
3.kubectl get deploy hpi-v3-2-0 -n model-nj -o yaml,找到image: dockerdist.bdmd.com/hpi:20200518050636和nfs: path: /exports/ai-doctor-assistant/v3_2/service_hpi/v3_2_0 server: 192.168.101.176 (本路径为外边挂载的路径[是模型存在的路径],不是容器内的路径)
4.mount -o vers=4 -t nfs 192.168.101.176:/ ~/model 把server: 192.168.101.176的模型路径挂载在mac上
5.将3中路径的模型scp -r ./checkpoint_last.pt machaoyang@192.168.206.101:/home/machaoyang/拷贝到office2上,然后docker cp checkpoint_last.pt 70b58b98844b:/ai-doctor-assistant/models/v3_2/service_hpi/v3_2_0到容器内模型的路径下
# v2
# 将测试模型界面的容器(从南京k8s找容器和模型)部署到office2上,以提供外部可以访问的接口的步骤:
1.打开数据库:other_model_test_base_config表格,找到模型对应的title以及version和url(区分是model还是model-nj)
2.在mac上kubectl get pods -n model-nj,通过1的url找到模型对应的服务,例如hpi-v3-2-0-d55b49998-bv4nn,只取hpi-v3-2-0
3.kubectl get deploy hpi-v3-2-0 -n model-nj -o yaml,找到image: dockerdist.bdmd.com/hpi:20200518050636
4.mount -o vers=4 -t nfs 192.168.101.176:/ ~/model 把server: 192.168.101.176的模型路径挂载在office2上(挂载只是建了个目录映射,没有将模型从k8s上下载下来,所以挂载的目录文件不会占据磁盘空间,这在mac电脑上体现最明显,由于model的所有文件有65G下载是在太慢,所以只下载和模型测试相关的服务的模型)
5.复制office2上的model/ai-doctor-assistant(约13G)到另外一个目录,例如/home/machaoyang/Projects/models/ai-doctor-assistant, 为了与容器内存放模型文件的目录保持一致,在ai-doctor-assistant目录下新建models目录,然后将原来ai-doctor-assistant下的所有子目录移动到models
6.docker run -itd -p 11040:8080 -p 11041:8081 -v /home/machaoyang/Projects/model/ai-doctor-assistant/models:/ai-doctor-assistant/models --name per_his_v3.2.5 dockerdist.bdmd.com/per-his:v3.2.5 python app.py
kubectl get pod hpi-v3-2-0-d55b49998-bv4nn -n model-nj -o yaml 查找pod时需要带上-d55b49998-bv4nn不能只写hpi-v3-2-0
kubectl get deploy struct-ecg-v1-1 -n model-nj -o yaml
kubectl get deploy past-his-v3-2-0 -n model-nj -o yaml
kubectl get deploy fam-his-v3-2-0 -n model-nj -o yaml
kubectl get deploy cmh-chunk-v3-1-0 -n model-nj -o yaml
kubectl get deploy lab-rela-v3-0-1 -n model-nj -o yaml
kubectl get deploy struct-sym-v3-1-2 -n model-nj -o yaml
kubectl get deploy struct-phy-v3-2-0 -n model-nj -o yaml
kubectl get deploy struct-exam66-v1-1 -n model-nj -o yaml
# kubectl相关指令:
# 查看有哪些名称空间
kubectl get ns
获取指定名称空间的pod
kubectl get pod -n model
kubectl get pod -n model-nj
kubectl -n model edit deploy diagnosis-v2.8.1.3 -o yaml
kubectl -n model-nj edit deploy per-his-v3-2-5 -o yaml
kubectl -n model-nj edit deploy struct-sym-v3-1-3 -o yaml
docker run -it -p 11000:8080 -p 11001:8081 --name per_his_v3.2.5_v1 dockerdist.bdmd.com/per-his:v3.2.5 /bin/bash
docker exec -it per_his_v3.2.5 /bin/bash
docker run -it -p 11010:8080 -p 11011:8081 --name struct_symptom_v3_1_2 dockerdist.bdmd.com/service_struct_symptom:20190918074629 /bin/bash
docker run -it -p 11020:8080 -p 11021:8081 --name struct_symptom_v3_1_1 dockerdist.bdmd.com/models/struct-symptom:v3.1.1 /bin/bash
# 现病史分块
docker run -p 18080:8080 -it --gpus=all --name pre_his_chunk_gpu dockerdist.bdmd.com/cmh-chunk:v1 /bin/bash
# 治疗史结构化
docker cp /home/machaoyang/Projects/symptom_struct/zk_jws/ cure_his_struct_1.0.0:/symptom_struct/zk_jws/
docker commit cure_his_struct_1.0.0 dockerdist.bdmd.com/symptom_struct_cuda_gemotric:cure_struct_1.0.0
docker run -p 3022:22 -p 8880:8080 -p 8881:8081 -p 8882:8081 -it --runtime=nvidia --name symptom_struct -v /home/machaoyang/Projects/symptom_struct:/symptom_struct dockerdist.bdmd.com/inquiry_cuda_gemotric:v2832 /bin/bash
docker run -p 4022:22 -p 8880:8080 -p 8881:8081 -p 8882:8081 -it --runtime=nvidia --name symptom_struct_v1 -v /home/machaoyang/Projects/symptom_struct:/symptom_struct dockerdist.bdmd.com/symptom_struct_cuda_gemotric:v1 /bin/bash
docker run -p 4022:22 -p 8880:8080 -p 8881:8081 -p 8882:8081 -it --runtime=nvidia --name cure_struct_v1.0.0 -v /home/machaoyang/Projects/symptom_struct:/symptom_struct dockerdist.bdmd.com/symptom_struct_cuda_gemotric:cure_struct /bin/bash
docker run -p 11130:8080 -p 11131:8081 -itd -w /symptom_struct/zk_jws --runtime=nvidia --name cure_his_struct_1.0.0 dockerdist.bdmd.com/symptom_struct_cuda_gemotric:cure_struct_1.0.0 python3 app.py
# mac 连接复兴
ssh -L 0.0.0.0:10020:10.8.0.6:22 machaoyang@192.168.206.102
ssh -p 10020 machaoyang@0.0.0.0
# 容器 连接复兴
ssh -L 0.0.0.0:10021:10.8.0.6:415 machaoyang@192.168.206.102
ssh -p 10021 root@0.0.0.0
每次mac待机后,需要重新执行一下ssh -L 0.0.0.0:10020:10.8.0.6:22 machaoyang@192.168.206.102和ssh -L 0.0.0.0:10021:10.8.0.6:415 machaoyang@192.168.206.102
scp -r /home/machaoyang/Projects/model_ner_test/ machaoyang@10.8.0.6:/data/machaoyang/model_ner_test/
# 体格检查结构化_GPU
docker run -it --gpus=all -p 12110:8080 -p 12111:8081 -p 12112:22 --name phy_struct_gpu dockerdist.bdmd.com/struct-phy:20190828041746_gpu_with_checkpoint /bin/bash
docker cp ../cpm-2-finetune prompt_learning_v8:/
docker cp phy_struct_gpu:/cpm-2-finetune ./
docker cp app.py flamboyant_hoover:/ai-doctor-assistant/v3_1_0
docker rm -f phy_struct_gpu
#docker run -itd -p 11030:8080 -p 11031:8081 -v ${MODEL_DIR}/ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0:/ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0 --name phy_struct_3.2.0 dockerdist.bdmd.com/struct-phy:20190828041746 python app.py
docker run -it --runtime=nvidia -p 12110:8080 -p 12111:8081 -p 12112:22 -v /home/machaoyang/Projects/model/ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0:/ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0 --name phy_struct_gpu dockerdist.bdmd.com/struct-phy:20190828041746 /bin/bash
CUDA_VISIBLE_DEVICES=0 python app.py
docker run -it -p 13110:8080 -p 13111:8081 -p 13112:22 -v /home/machaoyang/Projects/model/ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0:/ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0 --name phy_struct_cpu dockerdist.bdmd.com/struct-phy:20190828041746 /bin/bash
docker run -it -p 11030:8080 -p 11031:8081 -v /home/machaoyang/Projects/model/ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0:/ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0 --name phy_struct_cpu dockerdist.bdmd.com/struct-phy:20190828041746 python app.py
docker commit phy_struct_gpu dockerdist.bdmd.com/struct-phy:20190828041746_zh
docker rm -f phy_struct_gpu_
docker run -it -p 12040:8080 -p 12041:8081 -v /home/machaoyang/Projects/model/ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0:/ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0 --name phy_struct_gpu_ dockerdist.bdmd.com/struct-phy:20190828041746_zh /bin/bash
docker run -it -p 12050:8080 -p 12051:8081 --name phy_struct_gpu_ dockerdist.bdmd.com/struct-phy:20190828041746 /bin/bash
docker commit cure_his_struct_1.0.0 dockerdist.bdmd.com/symptom_struct_cuda_gemotric:cure_struct_1.0.0_no_print
# 症状结构化_GPU
docker commit 7c7b0f255b6f dockerdist.bdmd.com/service_struct_symptom:20190918074629_gpu_with_checkpoint
docker run -it --gpus=all -p 11120:8080 -p 11121:8081 -p 11122:22 --name sym_struct_gpu dockerdist.bdmd.com/service_struct_symptom:20190918074629_gpu_with_checkpoint /bin/bash
docker cp checkpoint_best.pt 4ff9b9b77791:/model/checkpoint_best.pt
docker run --gpus=all -it -p 11130:8080 -p 11131:8081 -p 11132:22 --name sym_struct_gpu_3.1.2 dockerdist.bdmd.com/struct-phy:20190828041746_gpu_with_checkpoint_3.1.2 /bin/bash
docker cp checkpoint_best.pt sym_struct_gpu_3.1.2:/model/checkpoint_best.pt
docker run -it --gpus=all -it -p 11120:8080 -p 11121:8081 -p 11122:22 --name sym_struct_gpu_3.1.2 dockerdist.bdmd.com/service_struct_symptom:20190918074629_gpu_with_checkpoint_3.1.2 /bin/bash
##############################################################
kubectl -n model-nj edit deploy struct-sym-v3-1-2 -o yaml
# 症状结构化
dockerdist.bdmd.com/service_struct_symptom:20190918074629
scp -r ./checkpoint_last.pt machaoyang@192.168.206.102:/home/machaoyang/struct-sym-v3-1-2
docker cp checkpoint_last.pt 70b58b98844b:/ai-doctor-assistant/models/v3_2/service_hpi/v3_2_0
/exports/ai-services/struct_sym_v3_1_2
-v /Users/machaoyang/model/ai-services/struct_sym_v3_1_2:
docker run -it -p 11010:8080 -p 11011:8081 --name struct_symptom_v3_1_2 dockerdist.bdmd.com/service_struct_symptom:20190918074629 /bin/bash
docker cp checkpoint_best.pt struct_symptom_v3_1_2:/model/checkpoint_best.pt
#体格检查结构化
http://struct-phy-v3-2-0.model-nj:8080/predict
kubectl -n model-nj edit deploy struct-phy-v3-2-0 -o yaml
mountPath: /ai-doctor-assistant/models/v3_2/service_struct_physical_exam/v3_2_0
path: /exports/ai-doctor-assistant/v3_2/service_struct_physical_exam/v3_2_0
docker run -it -p 11030:8080 -p 11031:8081 --name struct_phy_v3_2_0 dockerdist.bdmd.com/struct-phy:20190828041746 /bin/bash
scp -r ./v3_2_0 machaoyang@192.168.206.102:/home/machaoyang/service_struct_physical_exam
docker cp v3_2_0 struct_phy_v3_2_0:/ai-doctor-assistant/models/v3_2/service_struct_physical_exam
docker cp struct_phy_v3_2_0:/ai-doctor-assistant/models/v3_2/service_struct_physical_exam v3_2_0_052416
# struct_gd2h
docker run -it -p 12010:8080 -p 12011:8081 -p 12022:22 -v /home/machaoyang/Projects/struct_emr:/struct_emr --name struct_emr_v1 dockerdist.bdmd.com/cdss_klb_renew:v1.0125 /bin/bash
docker run -it --name struct_gd2h_v1 dockerdist.bdmd.com/cdss_klb_renew:v1.0125 /bin/bash
# ############################################################
# kubectl docker 常用命令
通过文件名或控制台输入,对资源进行配置。 如果资源不存在,将会新建一个
可以使用 JSON 或者 YAML 格式
Kubectl apply -f
docker build 命令用于使用 Dockerfile创建镜像
-f :指定要使用的Dockerfile路径
--tag, -t: 镜像的名字及标签,通常 name:tag 或者 name 格式;可以在一次构建中为一个镜像设置多个标签
docker build -f ./cdss/deploy/dev/Dockerfile -t dockerdist.bdmd.com/cdss:v1.0331 .
最后一个点是如果不指定-f则会从当前目录下找,如果指定-f的话则当前目录失效
# 后台模式
docker run -d ubuntu:15.10 /bin/sh -c "while true; do echo hello world; sleep 1; done"
在输出中,我们没有看到期望的 "hello world",而是一串长字符
2b1b7a428627c51ab8810d541d759f072b4fc75487eed05812646b8534a2fe63
这个长字符串叫做容器 ID,对每个容器来说都是唯一的,我们可以通过容器 ID 来查看对应的容器发生了什么。
加了 -d 参数默认不会进入容器,想要进入容器需要使用指令 docker exec 使用docker attach进入容器后退出容器会停止而docker exec进入容器退出容器后不会停止
当运行容器时,使用的镜像如果在本地中不存在,docker 就会自动从 docker 镜像仓库中下载,默认是从 Docker Hub 公共镜像源下载,不需要先docker pull
# 标注平台#####################
项目地址ai-platform app_annotation: 标注平台相关实现: 标注预览、文本标注、任务审核、任务管理、配置管理等 app_platform: AI平台相关实现: 原始病历、模型测试、知识图谱、产品演示等
服务在kubectl get pods -n platform
ai-platform-back-55c77cc4b-8bnp7 1/1 Running 0 5m47s
kubectl 部署服务
kubectl apply -f k8s.yaml
# 不良反应结构化
第一次进入容器:
docker run -it -p 12570:8080 -p 12571:8081 -p 1122:22 --name struct_adverse_reaction -v /home/machaoyang/Projects/demonstrate:/demonstrate dockerdist.bdmd.com/cdss_base:v1 /bin/bash
docker run -it --name ner_service_v1 ner-platform:1.0 /bin/bash
之后进入容器:
docker exec -it struct_adverse_reaction /bin/bash
cd /demonstrate/convert_to_triple/src
python app.py
# cdss_project
进入到项目根目录, /Users/machaoyang/Desktop/Projects/cdss_projects
/Users/machaoyang/Desktop/Projects/cdss_projects/extract_data
1. 以extract_data子项目为例,config的test为本地环境用于测试,dev为office1环境
2. 修改deploy.sh中的v1.0316,当然也可以不改
docker rmi dockerdist.bdmd.com/extract_data_cdss:v1.0316
docker build -f ./extract_data/deploy/dev/Dockerfile -t dockerdist.bdmd.com/extract_data_cdss:v1.0316 .
3. dockerfile文件不需要修改
4. 把项目文件scp到远程office1上
5. 运行 extract_data/deploy/dev/deploy.sh 然后docker ps 查看刚才操作最后得到的容器是否运行
# 当前目录
/Users/machaoyang/Desktop/Projects
scp -r cdss_sd machaoyang@192.168.206.101:/home/machaoyang/Projects/
docker logs convert_data_cdss --since 1m
容器挂载通过docker ps 无法看出(端口可以看出),可通过docker inspect看出
个人史结构化:
"Mounts": [
{
"Type": "bind",
"Source": "/home/machaoyang/Projects/symptom_struct",
"Destination": "/symptom_struct",
"Mode": "",
"RW": true,
"Propagation": "rprivate"
}
##################################################################################################################
个人史结构化
MODEL_DIR=/home/machaoyang/Projects/model # 指定项目根目录
## 现病史分块_3.1.0
#sudo docker rm -f pre_his_chunk_3.1.0
##docker run -p 11000:8080 -p 11001:8081 -itd --runtime=nvidia --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:v1 python app.py # 没有挂载,可以直接用 说明这个镜像是你后来自己封装的 把需要的模型文件加进去了
#docker run -itd -p 11000:8080 -p 11001:8081 --runtime=nvidia -v ${MODEL_DIR}/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0:/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0 --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 python app.py
##docker run -itd -p 11000:8080 -p 11001:8081 --runtime=nvidia -v ${MODEL_DIR}/ai-doctor-assistant/models:/ai-doctor-assistant/models --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:v1 python app.py # 直接挂载到models 不知为啥挂载的v3_1_0目录下没有模型文件
sudo docker run -it -p 11000:8080 -p 11001:8081 --gpus="device=1" -v ${MODEL_DIR}/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0:/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0 --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 python app.py
sudo docker run -it -p 11000:8080 -p 11001:8081 --gpus="device=1" -v ${MODEL_DIR}/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0:/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0 --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 /bin/bash
sudo docker run -it -p 11000:8080 -p 11001:8081 --gpus="device=1" -v /home/machaoyang/Projects/model/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0:/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0 --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 /bin/bash
sudo docker ./app.py pre_his_chunk_3.1.0:/ai-doctor-assistant/v3_1/service_cmh_chunk/v3_1_0
docker rm -f pre_his_chunk_3.1.0
docker run -itd -p 11000:8080 -p 11001:8081 --gpus="device=1" -v ${MODEL_DIR}/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0:/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0 --name pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330 python app.py
将容器打包成镜像:
docker commit pre_his_chunk_3.1.0 dockerdist.bdmd.com/cmh-chunk:20191129024330_fix
推送镜像:
docker push dockerdist.bdmd.com/cmh-chunk:20191129024330_fix
镜像导出:
docker save -o /data/pytorch-nlp-devel:1.10.1.tar 172.16.20.21:1180/pie-engine-training/pytorch-nlp-devel:1.10.1
镜像导入:
docker load -i /data/pytorch-nlp-devel:1.10.1.tar
docker image tag 172.16.20.21:1180/pie-engine-training/pytorch-nlp-devel:1.10.1_with_dynaconf 172.16.20.21:1180/pie-engine-training/pytorch-nlp-devel:1.10.1
查看镜像或者容器的元数据:
docker inspect 镜像或容器的名称
sudo docker run -itd -p 11010:8080 -p 11011:8081 --gpus="device=1" -v /home/machaoyang/Projects/model/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0:/ai-doctor-assistant/models/v3_1/service_cmh_chunk/v3_1_0 --name pre_his_chunk_3.1.0_v1 dockerdist.bdmd.com/cmh-chunk:20191129024330_fix python app.py
# 治疗史结构化_1.0.0:
docker rm -f cure_his_v1.0.0
#docker run -itd -p 11130:8080 -p 11131:8081 -v ${MODEL_DIR}/ai-doctor-assistant/models:/ai-doctor-assistant/models --name cure_his_v1.0.0 dockerdist.bdmd.com/symptom_struct_cuda_gemotric:v1 python app.py
#docker run -it -p 11130:8080 -p 11131:8081 -v ${MODEL_DIR}/ai-doctor-assistant/models:/ai-doctor-assistant/models --name cure_his_v1.0.0 dockerdist.bdmd.com/symptom_struct_cuda_gemotric:v1 /bin/bash
## 化验结构化3.0.1:
docker rm -f lab_struct_3.0.1
docker run -itd -p 11020:8080 -p 11021:8081 -v ${MODEL_DIR}/ai-doctor-assistant/models/v3_0/service_lab_relation/v3_0_1:/ai-doctor-assistant/models/v3_0/service_lab_relation/v3_0_1 --name lab_struct_3.0.1 dockerdist.bdmd.com/lab-rela:20190828014444 python app.py
# ##################################################################################################################
输出pod中一个容器的日志。如果pod只包含一个容器则可以省略容器名。容器内程序输出到标准输出的内容。如要获得tail -f 的方式,也可以使用-f选项
# kubectl logs 常用于将容器中的日志导出 kubectl logs [-f] [-p] POD [-c CONTAINER] -p 如果为true,输出pod中曾经运行过,但目前已终止的容器的日志 -c, --container="": 容器名 如果一个pod中只有一个容器,则不用指定容器名
kubectl get pod -n cdss
kubectl logs redis-cdss-6db9465d7f-tzhz5 -n cdss
kubectl logs pulsar-8dfbd74dc-zgj6s -n cdss --since 1m
kubectl logs pulsar-8dfbd74dc-zgj6s -c 容器名或者容器id -n cdss --since 1m
kubectl get pod --all-namespaces -o wide # 所有命名空间
kubectl get pod -n cdss -o wide # 所有命名空间
k8s进入指定pod下的指定容器的命令
kubectl --namespace=default exec -it user-deployment-54469dd57-vg87g --container user -- sh
kubectl get pod -n pie-engine-common
kubectl get namespace
kubectl describe algo-549244570518724608-20230224150150-6c65746765-6zzk4
kubectl get pods -n pie-engine-common
如果直接的变量提取函数,没有调用,不一定这个函数就没真的调用,而是通过global函数调用了(如果配置表中也没有这个函数的名字,则就是没使用),其他间接函数则是整个项目中没有使用就是真的没有使用
# 安装复合某个版本区间的包
pip install "spacy-pkuseg>=0.0.27,<0.1.0"
# 搜索要安装的包的所有版本
pip install typing-extensions==
python train_reward_model.py --model "nghuyong/ernie-3.0-base-zh" --train_path "data/reward_datasets/sentiment_analysis/train.tsv" --dev_path "data/reward_datasets/sentiment_analysis/dev.tsv" --save_dir "checkpoints/reward_model/sentiment_analysis" --img_log_dir "logs/reward_model/sentiment_analysis" --img_log_name "ERNIE Reward Model" --batch_size 32 --max_seq_len 128 --learning_rate 1e-5 --valid_steps 50 --logging_steps 10 --num_train_epochs 10 --device "cuda:0"
--model "data/roberta-base-finetuned-jd-binary-chinese" --train_path "data/reward_datasets/sentiment_analysis/train.tsv" --dev_path "data/reward_datasets/sentiment_analysis/dev.tsv" --save_dir "checkpoints/reward_model/sentiment_analysis" --img_log_dir "logs/reward_model/sentiment_analysis" --img_log_name "ERNIE Reward Model" --batch_size 32 --max_seq_len 128 --learning_rate 1e-5 --valid_steps 50 --logging_steps 10 --num_train_epochs 2 --device "cpu"
docker commit python3_jieba rackspacedot/python37:chatGLM
wget https://github.com/fatedier/frp/releases/download/v0.32.1/frp_0.32.1_linux_amd64.tar.gz
tar -zxvf frp_0.32.1_linux_amd64.tar.gz
根据端口占用的进程:
(base) htht@htht:~/mcy_frp/frp_0.32.1_linux_amd64$ lsof -i:7000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
frpc 5904 htht 6u IPv4 75992 0t0 TCP bogon:60230->8.130.126.131:bbs (ESTABLISHED)
(base) htht@htht:~/mcy_frp/frp_0.32.1_linux_amd64$
根据PID/COMMAND查看启动的命令文件
ps -ef|grep PID/COMMAND
root@iZ0jl16snshllvywzyv044Z:~# ps -ef|grep 76140
root 76140 1 0 Jun05 ? 00:13:44 ./frps -c ./frps.ini
进程的启动时间和运行时间:
ps -o lstart,etime -p 76140
重启:
shutdown -r now
8.130.126.131 6000
machaoyang
Mcy394964881
apt-get -y update
apt-get -y install openssl@1.1
nohup python spert.py train --config configs/example_train.conf > train_log.log 2>&1 &
CUDA_VISIBLE_DEVICES=2 nohup python spert.py train --config configs/first_game/example_train.conf > train_log.log 2>&1 &
find / -name bert-base-chinese
# 默认是只显示自己账号下的 加上sudo 则显示的所有的
(chatGLM) machaoyang@htht:~/Projects/text_game/event_extraction_v2/src/sequence_labeling$ fuser -v /dev/nvidia*
USER PID ACCESS COMMAND
/dev/nvidia0: machaoyang 3960986 F...m python
/dev/nvidia1: machaoyang 3960986 F...m python
/dev/nvidia2: machaoyang 3960986 F...m python
/dev/nvidia3: machaoyang 3960986 F...m python
/dev/nvidiactl: machaoyang 3960986 F...m python
/dev/nvidia-uvm: machaoyang 3960986 F...m python
sudo systemctl list-unit-files | grep enabled
查看系统版本:
lsb_release -a
查看系统开机了多长时间:
who -b
docker 为用户组名 查看用户组下有哪些用户
cat /etc/group | grep docker
重启docker
systemctl restart docker
查看是否安装NVIDIA显卡
lspci | grep -i nvidia
检查可安装的驱动
(base) htht@htht:~$ ubuntu-drivers devices
vendor : NVIDIA Corporation
model : TU102GL [Quadro RTX 6000/8000] (Quadro RTX 8000)
driver : nvidia-driver-525 - distro non-free
driver : nvidia-driver-470-server - distro non-free
driver : nvidia-driver-525-open - distro non-free
driver : nvidia-driver-535 - distro non-free
driver : nvidia-driver-535-open - distro non-free
driver : nvidia-driver-450-server - distro non-free
driver : nvidia-driver-535-server - distro non-free
driver : nvidia-driver-525-server - distro non-free
driver : nvidia-driver-470 - distro non-free
driver : nvidia-driver-535-server-open - distro non-free recommended
driver : xserver-xorg-video-nouveau - distro free builtin
查看安装的驱动版本:
nvidia-smi 显示的是当前驱动支持的最高版本的cuda,要安装的cuda版本不应高于这个版本 如果输入这个指令有反应在表示显卡驱动没问题
查看cuda版本:
nvcc -V
一个包的setup.py中的包依赖比requirements.txt的有效,如果后者不能让包安装成功,则可以去setup.py中找
对于复杂的任务 例如代码生成 见过标注数据的模型肯定是比gpt-3 zero-shot出来的好的 对于简单任务,gpt-3还好
docker run -d --gpus '"device=2"' -p 80:8501 -p 1022:22 --name langchain registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.3
docker run --gpus '"device=2"' -p 80:8501 -p 1022:22 --name langchain registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.3
docker build -f ./Dockerfile -t spark_graphframes:1.0 .
pip install git+https://github.com/graphframes/graphframes.git
pip install --default-timeout=100 loguru
du -h .
容器创建时挂载目录 在容器中删除文件会导致容器外挂载的文件删除 但是对容器commit为镜像时挂载目录的文件不会打包到镜像