Commenced in January 2007 | Frequency: Monthly | Edition: International | Paper Count: 235 |
The challenging task in educational institutions is to maximize the high performance of students and minimize the failure rate of poor-performing students. An effective method to leverage this task is to know student learning patterns with highly influencing factors and get an early prediction of student learning outcomes at the timely stage for setting up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The study is of this work is to propose techniques in EDM and integrate it into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method MICHI, which is the combination of mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction and uses the obtained dominant set as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stockholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals for intervening and improving student performance.
Information seekers get “lost in hyperspace” due to the voluminous documents updated daily on the internet. Adaptive Hypermedia Systems (AHS) are used to direct learners to their target goals. One of the most common AHS designed to help information seekers to overcome the problem of information overload is the Adaptive Education Hypermedia System (AEHS). However, this paper focuses on AEHS that adopts the learning preference of high school students and deliver learning content according to this preference throughout their learning experience. The research developed a prototype system for predicting students’ learning preference from the Visual, Aural, Read-Write and Kinesthetic (VARK) learning style model and adopting the learning content suitable to their preference. The predicting strength of several classifiers was compared and we found Support Vector Machine (SVM) to be more accurate in predicting learning style based on users’ preferences.
Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches are based on the use of a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We present a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.
Predicting the risk of Pancreatic Adenocarcinoma (PA) in advance can benefit the quality of care and potentially reduce population mortality and morbidity. The aim of this study was to develop and prospectively validate a risk prediction model to identify patients at risk of new incident PA as early as 3 months before the onset of PA in a statewide, general population in Maine. The PA prediction model was developed using Deep Neural Networks, a deep learning algorithm, with a 2-year electronic-health-record (EHR) cohort. Prospective results showed that our model identified 54.35% of all inpatient episodes of PA, and 91.20% of all PA that required subsequent chemoradiotherapy, with a lead-time of up to 3 months and a true alert of 67.62%. The risk assessment tool has attained an improved discriminative ability. It can be immediately deployed to the health system to provide automatic early warnings to adults at risk of PA. It has potential to identify personalized risk factors to facilitate customized PA interventions.
Conventional reservoir prediction methods ar not sufficient to explore the implicit relation between seismic attributes, and thus data utilization is low. In order to improve the predictive classification accuracy of reservoir lithology, this paper proposes a deep learning lithology prediction method based on ResNet (Residual Neural Network) and SENet (Squeeze-and-Excitation Neural Network). The neural network model is built and trained by using seismic attribute data and lithology data of Shengli oilfield, and the nonlinear mapping relationship between seismic attribute and lithology marker is established. The experimental results show that this method can significantly improve the classification effect of reservoir lithology, and the classification accuracy is close to 70%. This study can effectively predict the lithology of undrilled area and provide support for exploration and development.
Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches are based on the use of a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We introduce a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.
We have designed a Holter that measures the heart´s activity for over 24 hours, implemented a prediction methodology, and generate alarms as well as indicators to patients and treating physicians. Various diagnostic advances have been developed in clinical cardiology thanks to Holter implementation; however, their interpretation has largely been conditioned to clinical analysis and measurements adjusted to diverse population characteristics, thus turning it into a subjective examination. This, however, requires vast population studies to be validated that, in turn, have not achieved the ultimate goal: mortality prediction. Given this context, our Insight Research Group developed a mathematical methodology that assesses cardiac dynamics through entropy and probability, creating a numerical and geometrical attractor which allows quantifying the normalcy of chronic and acute disease as well as the evolution between such states, and our Tigum Research Group developed a holter device with 12 channels and advanced computer software. This has been shown in different contexts with 100% sensitivity and specificity results.
The shift towards decision making (DM) based on artificial intelligence (AI) techniques will change the way in which consumer markets and our societies function. Through AI, predictive analytics is being used by businesses to identify these patterns and major trends with the objective to improve the DM and influence future business outcomes. This paper proposes an Artificial Neural Network (ANN) approach to predict the success of telemarketing calls for selling bank long-term deposits. To validate the proposed model, we uses the bank marketing data of 41188 phone calls. The ANN attains 98.93% of accuracy which outperforms other conventional classifiers and confirms that it is credible and valuable approach for telemarketing campaign managers.
This study investigated factors affecting the price of cement in Nigeria, and developed a mathematical model that can predict future cement prices. Cement is key in the Nigerian construction industry. The changes in price caused by certain factors could affect economic and infrastructural development; hence there is need for proper proactive planning. Secondary data were collected from published information on cement between 2014 and 2019. In addition, questionnaires were sent to some domestic cement retailers in Port Harcourt in Nigeria, to obtain the actual prices of cement between the same periods. The study revealed that the most critical factors affecting the price of cement in Nigeria are inflation rate, population growth rate, and Gross Domestic Product (GDP) growth rate. With the use of data from United Nations, International Monetary Fund, and Central Bank of Nigeria databases, amongst others, a Multiple Linear Regression model was formulated. The model was used to predict the price of cement for 2020-2025. The model was then tested with 95% confidence level, using a two-tailed t-test and an F-test, resulting in an R2 of 0.8428 and R2 (adj.) of 0.6069. The results of the tests and the correlation factors confirm the model to be fit and adequate. This study will equip researchers and stakeholders in the construction industry with information for planning, monitoring, and management of present and future construction projects that involve the use of cement.
Compared with terrestrial network, the traffic of spatial information network has both self-similarity and short correlation characteristics. By studying its traffic prediction method, the resource utilization of spatial information network can be improved, and the method can provide an important basis for traffic planning of a spatial information network. In this paper, considering the accuracy and complexity of the algorithm, the spatial information network traffic is decomposed into approximate component with long correlation and detail component with short correlation, and a time series hybrid prediction model based on wavelet decomposition is proposed to predict the spatial network traffic. Firstly, the original traffic data are decomposed to approximate components and detail components by using wavelet decomposition algorithm. According to the autocorrelation and partial correlation smearing and truncation characteristics of each component, the corresponding model (AR/MA/ARMA) of each detail component can be directly established, while the type of approximate component modeling can be established by ARIMA model after smoothing. Finally, the prediction results of the multiple models are fitted to obtain the prediction results of the original data. The method not only considers the self-similarity of a spatial information network, but also takes into account the short correlation caused by network burst information, which is verified by using the measured data of a certain back bone network released by the MAWI working group in 2018. Compared with the typical time series model, the predicted data of hybrid model is closer to the real traffic data and has a smaller relative root means square error, which is more suitable for a spatial information network.
Power transformers are the most crucial part of power electrical system, distribution and transmission grid. This part is maintained using predictive or condition-based maintenance approach. The diagnosis of power transformer condition is performed based on Dissolved Gas Analysis (DGA). There are five main methods utilized for analyzing these gases. These methods are International Electrotechnical Commission (IEC) gas ratio, Key Gas, Roger gas ratio, Doernenburg, and Duval Triangle. Moreover, due to the importance of the transformers, there is a need for an accurate technique to diagnose and hence predict the transformer condition. The main objective of this technique is to avoid the transformer faults and hence to maintain the power electrical system, distribution and transmission grid. In this paper, the DGA was utilized based on the data collected from the transformer records available in the General Electricity Company of Libya (GECOL) which is located in Benghazi-Libya. The Fuzzy Logic (FL) technique was implemented as a diagnostic approach based on IEC gas ratio method. The FL technique gave better results and approved to be used as an accurate prediction technique for power transformer faults. Also, this technique is approved to be a quite interesting for the readers and the concern researchers in the area of FL mathematics and power transformer.
A rain cell ratio model is proposed that computes attenuation of the smallest rain cell which represents the maximum rain rate value i.e. the cell size when rainfall rate is exceeded 0.01% of the time, R0.01 and predicts attenuation for other cells as the ratio with this maximum. This model incorporates the dependence of the path factor r on the ellipsoidal path variation of the Fresnel zone at different frequencies. In addition, the inhomogeneity of rainfall is modeled by a rain drop packing density factor. In order to derive the model, two empirical methods that can be used to find rain cell size distribution Dc are presented. Subsequently, attenuation measurements from different climatic zones for terrestrial radio links with frequencies F in the range 7-38 GHz are used to test the proposed model. Prediction results show that the path factor computed from the rain cell ratio technique has improved reliability when compared with other path factor and effective rain rate models, including the current ITU-R 530-15 model of 2013.
After the widespread release of electronic trading, automated trading systems have become a significant part of the business intelligence system of any modern financial investment company. An important part of the trades is made completely automatically today by computers using mathematical algorithms. The trading decisions are taken almost instantly by logical models and the orders are sent by low-latency automatic systems. This paper will present a real-time price prediction methodology designed especially for algorithmic trading. Based on the price cyclicality function, the methodology revealed will generate price cyclicality bands to predict the optimal levels for the entries and exits. In order to automate the trading decisions, the cyclicality bands will generate automated trading signals. We have found that the model can be used with good results to predict the changes in market behavior. Using these predictions, the model can automatically adapt the trading signals in real-time to maximize the trading results. The paper will reveal the methodology to optimize and implement this model in automated trading systems. After tests, it is proved that this methodology can be applied with good efficiency in different timeframes. Real trading results will be also displayed and analyzed in order to qualify the methodology and to compare it with other models. As a conclusion, it was found that the price prediction model using the price cyclicality function is a reliable trading methodology for algorithmic trading in the financial market.
Diabetes is a medical condition that can lead to various diseases such as stroke, heart disease, blindness and obesity. In clinical practice, the concern of the diabetic patients towards the blood glucose examination is rather alarming as some of the individual describing it as something painful with pinprick and pinch. As for some patient with high level of glucose level, pricking the fingers multiple times a day with the conventional glucose meter for close monitoring can be tiresome, time consuming and painful. With these concerns, several non-invasive techniques were used by researchers in measuring the glucose level in blood, including ultrasonic sensor implementation, multisensory systems, absorbance of transmittance, bio-impedance, voltage intensity, and thermography. This paper is discussing the application of the near-infrared (NIR) spectroscopy as a non-invasive method in measuring the glucose level and the implementation of the linear system identification model in predicting the output data for the NIR measurement. In this study, the wavelengths considered are at the 1450 nm and 1950 nm. Both of these wavelengths showed the most reliable information on the glucose presence in blood. Then, the linear Autoregressive Moving Average Exogenous model (ARMAX) model with both un-regularized and regularized methods was implemented in predicting the output result for the NIR measurement in order to investigate the practicality of the linear system in this study. However, the result showed only 50.11% accuracy obtained from the system which is far from the satisfying results that should be obtained.
Unidentified Flying Objects (UFOs) have been an interesting topic for most enthusiasts and hence people all over the United States report such findings online at the National UFO Report Center (NUFORC). Some of these reports are a hoax and among those that seem legitimate, our task is not to establish that these events confirm that they indeed are events related to flying objects from aliens in outer space. Rather, we intend to identify if the report was a hoax as was identified by the UFO database team with their existing curation criterion. However, the database provides a wealth of information that can be exploited to provide various analyses and insights such as social reporting, identifying real-time spatial events and much more. We perform analysis to localize these time-series geospatial events and correlate with known real-time events. This paper does not confirm any legitimacy of alien activity, but rather attempts to gather information from likely legitimate reports of UFOs by studying the online reports. These events happen in geospatial clusters and also are time-based. We look at cluster density and data visualization to search the space of various cluster realizations to decide best probable clusters that provide us information about the proximity of such activity. A random forest classifier is also presented that is used to identify true events and hoax events, using the best possible features available such as region, week, time-period and duration. Lastly, we show the performance of the scheme on various days and correlate with real-time events where one of the UFO reports strongly correlates to a missile test conducted in the United States.
Crude oil market is an immensely complex and dynamic environment and thus the task of predicting changes in such an environment becomes challenging with regards to its accuracy. A number of approaches have been adopted to take on that challenge and machine learning has been at the core in many of them. There are plenty of examples of algorithms based on machine learning yielding satisfactory results for such type of prediction. In this paper, we have tried to predict crude oil prices using Long Short-Term Memory (LSTM) based recurrent neural networks. We have tried to experiment with different types of models using different epochs, lookbacks and other tuning methods. The results obtained are promising and presented a reasonably accurate prediction for the price of crude oil in near future.
In order to study the impact of various factors on the housing price, we propose to build different prediction models based on deep learning to determine the existing data of the real estate in order to more accurately predict the housing price or its changing trend in the future. Considering that the factors which affect the housing price vary widely, the proposed prediction models include two categories. The first one is based on multiple characteristic factors of the real estate. We built Convolution Neural Network (CNN) prediction model and Long Short-Term Memory (LSTM) neural network prediction model based on deep learning, and logical regression model was implemented to make a comparison between these three models. Another prediction model is time series model. Based on deep learning, we proposed an LSTM-1 model purely regard to time series, then implementing and comparing the LSTM model and the Auto-Regressive and Moving Average (ARMA) model. In this paper, comprehensive study of the second-hand housing price in Beijing has been conducted from three aspects: crawling and analyzing, housing price predicting, and the result comparing. Ultimately the best model program was produced, which is of great significance to evaluation and prediction of the housing price in the real estate industry.
There is a great challenge for civil engineering field to contribute in environment prevention by finding out alternatives of cement and natural aggregates. There is a problem of global warming due to cement utilization in concrete, so it is necessary to give sustainable solution to produce concrete containing waste. It is very difficult to produce designated grade of concrete containing different ingredient and water cement ratio including waste to achieve desired fresh and harden properties of concrete as per requirement and specifications. To achieve the desired grade of concrete, a number of trials have to be taken, and then after evaluating the different parameters at long time performance, the concrete can be finalized to use for different purposes. This research work is carried out to solve the problem of time, cost and serviceability in the field of construction. In this research work, artificial neural network introduced to fix proportion of concrete ingredient with 50% waste replacement for M20, M25, M30, M35, M40, M45, M50, M55 and M60 grades of concrete. By using the neural network, mix design of high performance concrete was finalized, and the main basic mechanical properties were predicted at 3 days, 7 days and 28 days. The predicted strength was compared with the actual experimental mix design and concrete cube strength after 3 days, 7 days and 28 days. This experimentally and neural network based mix design can be used practically in field to give cost effective, time saving, feasible and sustainable high performance concrete for different types of structures.
Software technology is developing rapidly which leads to the growth of various industries. Now-a-days, software-based applications have been adopted widely for business purposes. For any software industry, development of reliable software is becoming a challenging task because a faulty software module may be harmful for the growth of industry and business. Hence there is a need to develop techniques which can be used for early prediction of software defects. Due to complexities in manual prediction, automated software defect prediction techniques have been introduced. These techniques are based on the pattern learning from the previous software versions and finding the defects in the current version. These techniques have attracted researchers due to their significant impact on industrial growth by identifying the bugs in software. Based on this, several researches have been carried out but achieving desirable defect prediction performance is still a challenging task. To address this issue, here we present a machine learning based hybrid technique for software defect prediction. First of all, Genetic Algorithm (GA) is presented where an improved fitness function is used for better optimization of features in data sets. Later, these features are processed through Decision Tree (DT) classification model. Finally, an experimental study is presented where results from the proposed GA-DT based hybrid approach is compared with those from the DT classification technique. The results show that the proposed hybrid approach achieves better classification accuracy.