Research on Code Knowledge Graph Construction and Intelligent Software Development Methods (2020 Journal of Software)
Low-Redundancy Knowledge Graph Management with Range Query Support (2019 Journal of Computer Research and Development)
Patent expanded retrieval via word embedding under composite-domain perspectives (2019 Frontiers of Computer Science)

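  • Intelligent software development is shifting from simple code retrieval to semantically empowered automatic code generation. Traditional semantic representations cannot effectively support semantic interaction among humans, machines, and code, so exploring machine-understandable semantic representation mechanisms is urgent. This paper first points out that the code knowledge graph is the foundation of intelligent software development, and then analyzes the new characteristics of intelligent software development in the big data era and the new challenges of carrying it out on the basis of code knowledge graphs. It then reviews the state of research on intelligent software development and code knowledge graphs, pointing out that existing research on intelligent software development remains at a relatively low level, while existing knowledge graph research mainly targets open-domain knowledge graphs and cannot be applied directly to code-domain knowledge graphs. Accordingly, it discusses new research trends in detail from five aspects of code knowledge graphs: modeling and representation, construction and refinement, storage and evolution management, query semantic understanding, and intelligent applications, so as to better meet the needs of intelligent software development based on code knowledge graphs.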


    Details: http://www.jos.org.cn/html/2020/1/5893.htm


    代码知识图谱构建及智能化软件开发方法研究综述.pdf

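  • As more and more data is organized and published in the form of knowledge graphs, knowledge graph management has attracted considerable attention. Existing knowledge graph management methods have two obvious drawbacks: 1) logical storage modeling produces a large amount of data redundancy and cannot effectively support range queries over continuous attributes; 2) semantic storage modeling is expensive and cannot adapt effectively to the dynamic evolution of queries. This paper proposes the cluster object deputy model (CODM) for modeling and managing knowledge and meta-knowledge. The model has two features: schematized logical storage modeling and lightweight semantic storage modeling. CODM designs a schema clustering algorithm based on set edit distance that converts a knowledge graph into schema data, achieving schematized storage of the data and supporting index specialization by attribute data type. In addition, CODM builds a class hierarchy to model the various semantic associations among entities, and uses object pointers to achieve lightweight materialization of generalized semantic associations. Experimental results show that CODM not only greatly reduces data redundancy and effectively supports range queries, but also speeds up the processing of complex queries.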


    Details: https://crad.ict.ac.cn/CN/abstract/abstract3993.shtml


    支持范围查询的低冗余知识图谱管理.pdf

  • Patent prior art search uses dispersed information to retrieve all the relevant documents with strong ambiguity from the massive patent database. This challenging task consists of patent reduction and patent expansion. Existing studies on patent reduction ignore the relevance between technical characteristics and technical domains, and result in ambiguous queries. Works on patent expansion expand terms from external resources by selecting words with similar distribution or similar semantics. However, this splits the relevance between the distribution and semantics of the terms. Besides, common repositories hardly meet the requirement of patent expansion for uncommon semantics and unusual terms. In order to solve these problems, we first present a novel composite-domain perspective model which converts the technical characteristic of a query patent to a specific composite classified domain and generates aspect queries. We then implement patent expansion with double consistency by combining distribution and semantics simultaneously. We also propose to train semantic vector spaces via word embedding under the specific classified domains, so as to provide domain-aware expanded resources. Finally, multiple retrieval results of the same topic are merged based on perspective weight and rank in the results. Our experimental results on CLEF-IP 2010 demonstrate that our method is very effective. It reaches about 5.43% improvement in recall and nearly 12.38% improvement in PRES over the state-of-the-art. Our work also achieves the best performance balance in terms of recall, MAP and PRES.


    Details: https://journal.hep.com.cn/fcs/EN/10.1007/s11704-018-7056-6


    Patent expanded retrieval via word embedding under composite-domain perspectives.pdf
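
The last abstract says that multiple retrieval results for the same topic are merged based on perspective weight and rank, but it does not give the formula. As a rough illustration only, the sketch below assumes a weighted reciprocal-rank rule (score = sum of weight / rank over the perspectives); the function name `merge_ranked_results`, the perspective names, the weights, and the document ids are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def merge_ranked_results(result_lists, weights):
    """Fuse several ranked lists into one by weighted reciprocal rank.

    result_lists: dict mapping a perspective name to its ranked list of document ids
    weights: dict mapping the same perspective names to non-negative weights
    NOTE: the weight / rank scoring rule is an assumption made for illustration,
    not the formula used in the paper.
    """
    scores = defaultdict(float)
    for perspective, ranked_docs in result_lists.items():
        weight = weights[perspective]
        for rank, doc_id in enumerate(ranked_docs, start=1):
            # A higher perspective weight and a better (smaller) rank both raise the score.
            scores[doc_id] += weight / rank
    # Sort by descending score; break ties by document id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical example: two perspectives retrieving overlapping patents.
results = {
    "domain_A": ["p3", "p1", "p2"],
    "domain_B": ["p1", "p4"],
}
weights = {"domain_A": 0.7, "domain_B": 0.3}
merged = merge_ranked_results(results, weights)
print(merged)
```

A document returned under several perspectives accumulates score from each of them, so patents that are relevant from more than one composite-domain perspective tend to move up in the merged list.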
