|Commenced in January 2007||Frequency: Monthly||Edition: International||Paper Count: 14|
Measuring semantic similarity between texts is calculating semantic relatedness between texts using various techniques. Our web application (Measuring Relatedness of Concepts-MRC) allows user to input two text corpuses and get semantic similarity percentage between both using WordNet. Our application goes through five stages for the computation of semantic relatedness. Those stages are: Preprocessing (extracts keywords from content), Feature Extraction (classification of words into Parts-of-Speech), Synonyms Extraction (retrieves synonyms against each keyword), Measuring Similarity (using keywords and synonyms, similarity is measured) and Visualization (graphical representation of similarity measure). Hence the user can measure similarity on basis of features as well. The end result is a percentage score and the word(s) which form the basis of similarity between both texts with use of different tools on same platform. In future work we look forward for a Web as a live corpus application that provides a simpler and user friendly tool to compare documents and extract useful information.
Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.
Vehicle Routing Problem (VRP) is a complex combinatorial optimization problem and it is quite difficult to find an optimal solution consisting of a set of routes for vehicles whose total cost is minimum. Evolutionary and swarm intelligent (SI) algorithms play a vital role in solving optimization problems. While the SI algorithms perform search, the diversity between the solutions they exploit is very important. This is because of the need to avoid early convergence and to get an appropriate balance between the exploration and exploitation. Therefore, it is important to check how far the solutions are diverse. In this paper, we measure the similarity between solutions, which ABC exploits while optimizing VRP. The similar solutions found are discarded at the end of the iteration and only unique solutions are passed on to the next iteration. The bees of discarded solutions become scouts and they start searching for new solutions. This process is continued and results show that the solution is optimized at lesser number of iterations but with the overhead of computing similarity in all the iterations. The problem instance from Solomon benchmarked dataset has been used for evaluating the presented methodology.
The system is designed to show images which are related to the query image. Extracting color, texture, and shape features from an image plays a vital role in content-based image retrieval (CBIR). Initially RGB image is converted into HSV color space due to its perceptual uniformity. From the HSV image, Color features are extracted using block color histogram, texture features using Haar transform and shape feature using Fuzzy C-means Algorithm. Then, the characteristics of the global and local color histogram, texture features through co-occurrence matrix and Haar wavelet transform and shape are compared and analyzed for CBIR. Finally, the best method of each feature is fused during similarity measure to improve image retrieval effectiveness and accuracy.
In this work, we begin with the presentation of the Tθ family of usual similarity measures concerning multidimensional binary data. Subsequently, some properties of these measures are proposed. Finally the impact of the use of different inter-elements measures on the results of the Agglomerative Hierarchical Clustering Methods is studied.
The primary objective of the paper is to propose a new method for solving assignment problem under uncertain situation. In the classical assignment problem (AP), zpqdenotes the cost for assigning the qth job to the pth person which is deterministic in nature. Here in some uncertain situation, we have assigned a cost in the form of composite relative degree Fpq instead of and this replaced cost is in the maximization form. In this paper, it has been solved and validated by the two proposed algorithms, a new mathematical formulation of IVIF assignment problem has been presented where the cost has been considered to be an IVIFN and the membership of elements in the set can be explained by positive and negative evidences. To determine the composite relative degree of similarity of IVIFS the concept of similarity measure and the score function is used for validating the solution which is obtained by Composite relative similarity degree method. Further, hypothetical numeric illusion is conducted to clarify the method’s effectiveness and feasibility developed in the study. Finally, conclusion and suggestion for future work are also proposed.
Prediction of future research topics by using time series analysis either statistical or machine learning has been conducted previously by several researchers. Several methods have been proposed to combine the forecasting results into single forecast. These methods use fixed combination of individual forecast to get the final forecast result. In this paper, quite different approach is employed to select the forecasting methods, in which every point to forecast is calculated by using the best methods used by similar validation dataset. The dataset used in the experiment is time series derived from research report in Garuda, which is an online sites belongs to the Ministry of Education in Indonesia, over the past 20 years. The experimental result demonstrates that the proposed method may perform better compared to the fix combination of predictors. In addition, based on the prediction result, we can forecast emerging research topics for the next few years.
Information retrieval has become an important field of study and research under computer science due to explosive growth of information available in the form of full text, hypertext, administrative text, directory, numeric or bibliographic text. The research work is going on various aspects of information retrieval systems so as to improve its efficiency and reliability. This paper presents a comprehensive study, which discusses not only emergence and evolution of information retrieval but also includes different information retrieval models and some important aspects such as document representation, similarity measure and query expansion.