TCDB Membrane Transport Database
Official database site: https://tcdb.org/download.php

Script: /TJPROJ6/RNA_SH/script_dir/annotation/TCDB/TCDB_blast.py

To add second-level annotation, use the script below. It takes a tab-separated annotation table as its first argument, skips the header line, and reads the second column, whose "|"-separated fields end with the cluster and the TC ID. For each three-level TC ID prefix (for example 1.A.1) it counts the distinct clusters and the number of genes, then looks up the name for that ID on tcdb.org.

<code>
import re
import sys

import requests

session = requests.session()

# Optional local annotation table (disabled in the original script):
#annot_open = open("annot.xls").readlines()
#annot = {}
#for each in annot_open:
#    each_lines = each.strip().split("-")
#    annot[each_lines[0].strip()] = each_lines[1].strip()

# Three-level TC ID such as 1.A.1; the dots are escaped so they match literally.
TC_ID = re.compile(r"([0-9]\.[A-Za-z]\.[0-9]+)")


def get_subfamily(ids):
    """Return the name tcdb.org reports for a TC ID, or None if the lookup fails."""
    info_url = "https://tcdb.org/getinfo.php?id=" + ids
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36"
    }
    try:
        resp = session.post(info_url, headers=headers)
        if resp.status_code == 200:
            return resp.json()["name"]
    except (requests.RequestException, ValueError, KeyError) as e:
        print("Error", e.args, file=sys.stderr)
    return None


result = {}
with open(sys.argv[1]) as xls_open:
    next(xls_open, None)  # skip the header line
    for each in xls_open:
        each_lines = each.strip().split("\t")
        fields = each_lines[1].split("|")
        type_, cluster = fields[-1], fields[-2]
        match = TC_ID.search(type_)
        if not match:
            # The original printed an undefined name here; report the row and skip it.
            print(each.strip(), file=sys.stderr)
            continue
        entry = result.setdefault(match.group(1), {"cluster_number": [], "gene_number": 0})
        if cluster not in entry["cluster_number"]:
            entry["cluster_number"].append(cluster)
        entry["gene_number"] += 1

# str.strip("_tcdb.xls") removes a set of characters, not a suffix,
# so the suffix is trimmed explicitly before taking the first path component.
path = sys.argv[1]
suffix = "_tcdb.xls"
if path.endswith(suffix):
    path = path[:-len(suffix)]
name = path.split("/")[0]

for each_key, each_value in result.items():
    # "NA" is a placeholder added here so a failed lookup does not break the output line.
    subfamily = get_subfamily(each_key) or "NA"
    print("\t".join([name, each_key, str(len(each_value["cluster_number"])),
                     str(each_value["gene_number"]), str(subfamily)]))
</code>
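The counting step of the script above can be tried offline, without calling tcdb.org. The following is a minimal sketch, not the production script: the function name `summarize`, the `(cluster, tc_id)` row layout, and the sample IDs are all assumptions made for illustration. It shows how a three-level prefix is cut from a longer TC ID and how distinct clusters and gene rows are tallied per prefix.

```python
import re

# Three-level TC ID prefix such as 1.A.1 (dots escaped so they match literally).
TC_FAMILY = re.compile(r"([0-9]\.[A-Za-z]\.[0-9]+)")


def summarize(rows):
    """Tally (distinct clusters, gene rows) per three-level TC ID prefix.

    rows is an iterable of (cluster, tc_id) pairs; this layout is hypothetical.
    Rows whose tc_id contains no recognizable TC ID are skipped.
    """
    clusters = {}
    genes = {}
    for cluster, tc_id in rows:
        match = TC_FAMILY.search(tc_id)
        if match is None:
            continue
        family = match.group(1)
        clusters.setdefault(family, set()).add(cluster)
        genes[family] = genes.get(family, 0) + 1
    return {family: (len(clusters[family]), genes[family]) for family in genes}


# Made-up sample rows, for illustration only.
rows = [
    ("c1", "1.A.1.2.3"),
    ("c1", "1.A.1.4.1"),
    ("c2", "1.A.1.2.3"),
    ("c3", "2.A.1.1.1"),
    ("c4", "unknown"),
]
summary = summarize(rows)
for family, (n_clusters, n_genes) in summary.items():
    print(family, n_clusters, n_genes, sep="\t")
```

Using a set per prefix keeps the cluster count distinct while the gene count grows by one for every matching row, which mirrors the `cluster_number` and `gene_number` fields of the script.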
tcdb膜转运数据库.txt · Last modified: 2023/01/05 06:41 by fengjie