Medical Knowledge Graph Q&A: Text Classification and Parsing
Preface
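With the Neo4j database built, the next step is to implement the medical Q&A feature itself. Since this is a first version, answering questions does not involve deep learning; for now it is simply a conditional query process. That process has three steps: break the question into keywords and classify it, use the extracted terms and the question type to query the graph database, and finally assemble a natural-language answer from the query results and the question type. The code below walks through this process.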
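The three steps can be sketched end to end without a database. This is a minimal illustration rather than the project's actual code: the cue words, the `Disease` and `Symptom` labels, the `has_symptom` relationship, and the helper names `classify`, `build_query`, and `format_answer` are all assumptions made for the example.

```python
# Hypothetical cue words; a real system would keep a longer list per question type.
SYMPTOM_CUES = ['症状', '表现']


def classify(question, known_diseases):
    """Step 1: find a known disease name and decide the question type."""
    entity = next((d for d in known_diseases if d in question), None)
    if entity is None:
        return None
    if any(cue in question for cue in SYMPTOM_CUES):
        return {'question_type': 'disease_symptom', 'entity': entity}
    # Fall back to a general description when no cue word matches.
    return {'question_type': 'disease_desc', 'entity': entity}


def build_query(parsed):
    """Step 2: turn the type and entity into a Cypher string (assumed schema)."""
    if parsed['question_type'] == 'disease_symptom':
        return ("MATCH (m:Disease)-[r:has_symptom]->(n:Symptom) "
                "WHERE m.name = '{0}' RETURN m.name, n.name".format(parsed['entity']))
    return ("MATCH (m:Disease) WHERE m.name = '{0}' "
            "RETURN m.name, m.desc".format(parsed['entity']))


def format_answer(parsed, rows):
    """Step 3: assemble a sentence from the query rows."""
    if not rows:
        return ''
    if parsed['question_type'] == 'disease_symptom':
        names = sorted({row['n.name'] for row in rows})
        return '{0}的症状包括:{1}'.format(rows[0]['m.name'], ';'.join(names))
    return '{0}:{1}'.format(rows[0]['m.name'], rows[0]['m.desc'])


parsed = classify('感冒有什么症状', ['感冒', '肺炎'])
query = build_query(parsed)
# Rows are written by hand here, in the shape a graph query would return.
rows = [{'m.name': '感冒', 'n.name': '咳嗽'}, {'m.name': '感冒', 'n.name': '发热'}]
print(query)
print(format_answer(parsed, rows))
```

Formatting the entity straight into the query string keeps the sketch short; a more careful version would pass it as a query parameter so that unusual characters in a question cannot alter the query.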
Environment
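In addition to the environment described earlier, two more libraries are needed: ahocorasick, which extracts keywords from the question, and colorama, which colors the text in the output panel.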
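As a rough illustration of the keyword extraction step, the sketch below loads a small vocabulary into an Aho-Corasick automaton and scans a question for every term it contains. The vocabulary, the entity type names, and the function names are assumptions for the example. The `try`/`except` is there only so the sketch still runs where the package is not installed, in which case it falls back to a plain substring scan.

```python
try:
    import ahocorasick  # third-party package: pip install pyahocorasick
except ImportError:
    ahocorasick = None  # fall back to a plain substring scan below

# Hypothetical vocabulary mapping each term to its entity type.
VOCAB = {'感冒': 'disease', '咳嗽': 'symptom', '阿司匹林': 'drug'}


def build_automaton(vocab):
    """Build an Aho-Corasick automaton that stores (word, type) for each term."""
    automaton = ahocorasick.Automaton()
    for word, entity_type in vocab.items():
        automaton.add_word(word, (word, entity_type))
    automaton.make_automaton()
    return automaton


def extract_keywords(question, vocab=VOCAB):
    """Return {word: entity_type} for every vocabulary term found in the question."""
    if ahocorasick is None:
        return {w: t for w, t in vocab.items() if w in question}
    # Built on every call for brevity; a real service would build it once.
    automaton = build_automaton(vocab)
    # iter() yields (end_index, stored_value) for each match in the text.
    return {word: entity_type for _, (word, entity_type) in automaton.iter(question)}


print(extract_keywords('感冒了一直咳嗽怎么办'))
```

The automaton finds all terms in a single pass over the question, which matters once the vocabulary grows to thousands of disease, symptom, and drug names.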
Coding
1. Q&A panel
2. Question classification
3. Type parsing (query assembly)
4. Data querying (answer assembly)
from py2neo import Graph


class Answer:
    def __init__(self):
        self.neo4j = Graph('bolt://localhost:7687', auth=('neo4j', 'beiqiaosu123456'))
        # Maximum number of items listed in a single answer.
        self.num_limit = 20

    def main(self, question_parse):
        answers_final = []
        for item in question_parse:
            # Spelling kept as in the source; it must match the key produced by the parsing step.
            question_type = item['qustion_type']
            sqls = item['sql']
            answer = []
            # Run every query for this question type and collect the rows.
            for sql in sqls:
                data = self.neo4j.run(sql)
                answer += data.data()
            final_answer = self.answer_prettify(question_type, answer)
            if final_answer:
                answers_final.append(final_answer)
        return answers_final

    # Pick the reply template that matches the question type.
    def answer_prettify(self, question_type, answers):
        final_answer = []
        if not answers:
            return ''

        if question_type == 'disease_symptom':
            desc = [i['n.name'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}的症状包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'symptom_disease':
            desc = [i['m.name'] for i in answers]
            subject = answers[0]['n.name']
            final_answer = '症状{0}可能染上的疾病有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_cause':
            desc = [i['m.cause'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}可能的成因有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_prevent':
            desc = [i['m.prevent'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}的预防措施包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_lasttime':
            desc = [i['m.cure_lasttime'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}治疗可能持续的周期为:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_cureway':
            # m.cure_way is a list, so join it into one string per row first.
            desc = [';'.join(i['m.cure_way']) for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}可以尝试如下治疗:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_cureprob':
            desc = [i['m.cured_prob'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}治愈的概率为(仅供参考):{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_easyget':
            desc = [i['m.easy_get'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}的易感人群包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_desc':
            desc = [i['m.desc'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0},熟悉一下:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_acompany':
            # The relationship can point either way, so merge both ends and drop the subject itself.
            desc1 = [i['n.name'] for i in answers]
            desc2 = [i['m.name'] for i in answers]
            subject = answers[0]['m.name']
            desc = [i for i in desc1 + desc2 if i != subject]
            final_answer = '{0}的并发症包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_not_food':
            desc = [i['n.name'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}忌食的食物包括有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_do_food':
            # Split rows by relationship name: foods that are allowed vs. foods that are recommended.
            do_desc = [i['n.name'] for i in answers if i['r.name'] == '可以吃']
            recommand_desc = [i['n.name'] for i in answers if i['r.name'] == '推荐吃']
            subject = answers[0]['m.name']
            final_answer = '{0}宜食的食物包括有:{1}\n推荐食谱包括有:{2}'.format(
                subject,
                ';'.join(list(set(do_desc))[:self.num_limit]),
                ';'.join(list(set(recommand_desc))[:self.num_limit]))

        elif question_type == 'food_not_disease':
            desc = [i['m.name'] for i in answers]
            subject = answers[0]['n.name']
            final_answer = '患有{0}的人最好不要吃{1}'.format(';'.join(list(set(desc))[:self.num_limit]), subject)

        elif question_type == 'food_do_disease':
            desc = [i['m.name'] for i in answers]
            subject = answers[0]['n.name']
            final_answer = '患有{0}的人建议多试试{1}'.format(';'.join(list(set(desc))[:self.num_limit]), subject)

        elif question_type == 'disease_drug':
            desc = [i['n.name'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}通常的使用的药品包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'drug_disease':
            desc = [i['m.name'] for i in answers]
            subject = answers[0]['n.name']
            final_answer = '{0}主治的疾病有{1},可以试试'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'disease_check':
            desc = [i['n.name'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}通常可以通过以下方式检查出来:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        elif question_type == 'check_disease':
            desc = [i['m.name'] for i in answers]
            subject = answers[0]['n.name']
            final_answer = '通常可以通过{0}检查出来的疾病有{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))

        return final_answer
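Most branches of `answer_prettify` share one shape: pick a column of items, pick a subject, deduplicate, truncate, and fill a template. One possible way to express that shape is a lookup table, sketched below for three of the question types. The names `TEMPLATES` and `prettify` are hypothetical, and the sample rows are written by hand in the form the query results take (`m.name`, `n.name`), so no database is needed to try it.

```python
# question_type -> (row key holding the items, row key holding the subject, template)
# Only three of the question types are listed; several others follow the same pattern.
TEMPLATES = {
    'disease_symptom': ('n.name', 'm.name', '{0}的症状包括:{1}'),
    'symptom_disease': ('m.name', 'n.name', '症状{0}可能染上的疾病有:{1}'),
    'disease_drug': ('n.name', 'm.name', '{0}通常的使用的药品包括:{1}'),
}


def prettify(question_type, rows, num_limit=20):
    """Format query rows into one answer string, or '' if nothing applies."""
    if not rows or question_type not in TEMPLATES:
        return ''
    item_key, subject_key, template = TEMPLATES[question_type]
    # dict.fromkeys removes duplicates while keeping first-seen order,
    # unlike set(), whose iteration order is not guaranteed.
    items = list(dict.fromkeys(row[item_key] for row in rows))[:num_limit]
    return template.format(rows[0][subject_key], ';'.join(items))


rows = [
    {'m.name': '感冒', 'n.name': '咳嗽'},
    {'m.name': '感冒', 'n.name': '发热'},
    {'m.name': '感冒', 'n.name': '咳嗽'},  # duplicate row
]
print(prettify('disease_symptom', rows))
```

Types such as `disease_do_food`, which fills two lists, and `food_not_disease`, which puts the item list before the subject, do not fit this table as written and would still need their own handling.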
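Final Words
That is all the code for this medical Q&A bot. As the Q&A exchanges above show, the answers are still quite stiff, because the system is essentially a programmatic mind map. There is plenty of room for improvement, and a later revision will rework the classification and parsing step with deep learning.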
Personal website: www.zerofc.cn
WeChat official account: ZEROFC_DEV
QQ group: 515937120
QQ: 2652364582
Toutiao account: 1637769351151619
Bilibili: 286666708
Dayu account: 北桥苏