GIS Assignment Help Sample Paper
When it comes to GIS assignment help, experience and skills matter. Our writers, who completed this GIS homework, are available at your request. We have helped many clients with such tasks, including ArcGIS Online help, and we would be happy to help you write high-grade papers. Below is a sample paper written by our writers, preceded by the instructions it was written to:
Write a literature review of point density and kernel density estimation (KDE) techniques and algorithms for the purpose of analyzing geo-tagged Twitter data. In particular, some of the algorithms I’m looking at are DBSCAN, ST-DBSCAN, MR-VDBSCAN, KDE (the Silverman equation), and adaptive KDE (like Brunsdon’s algorithm). I’d like more information on them, as well as on similar and related algorithms and techniques that are used for analyzing geospatial point data.
GIS Assignment Help (Analysis of Geo-tagged Data)
Recently, the use of online social networks (OSNs) has become a key area of research in machine learning, statistical learning, natural language processing, and sensor systems. The aim is to detect, monitor, and publish information about an incident or event in a timely manner, supported by a location-based system. Twitter has emerged as a key platform for posting about events such as crime, floods, wildfires, disease outbreaks, and earthquakes, which in turn creates awareness.
The platform has been used successfully in a range of monitoring studies based on different models. One example is earthquake detection using a probabilistic temporal model; the authors report a 96% probability of correctly detecting earthquakes from text posted on Twitter (Sakaki et al., 2010). Other reported studies in public health, natural disasters, human behavior monitoring, and data mining also use Twitter effectively to extract information (Delso et al., 2018; Matheson, 2018; Hernandez-Suarez et al., 2019).
All these studies apply different techniques to understand the collected data and to analyze information from online social networks. Improvements in computing and deep learning have enhanced algorithms that, supported by spatial analysis, reveal patterns in geotagged social media images and texts (Koylu et al., 2019).
Two distinct approaches to geoparsing data from online social networks emerge from these studies. These are:
Vector Space Feature Representation – The model captures the relevant words in the tweet and assigns each a numerical weight, so that the sentence is represented as either a dense or a sparse vector of size V. Common algorithms in this approach are one-hot encoding, bag of words, and term frequency-inverse document frequency (TF-IDF). The model has limitations, however: it does not preserve semantic, linguistic, or syntactic features, so it cannot establish relationships between terms, and analyzing contiguous data becomes a challenge.
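As a minimal sketch of the vector space idea, the following computes TF-IDF weights for tokenized tweets using only the Python standard library. The function name `tf_idf`, the example tweets, and the choice of the plain `log(N / df)` weighting are assumptions made for illustration; production libraries typically apply smoothing and normalization that are omitted here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one sparse {term: weight} vector per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            # Term frequency times inverse document frequency (unsmoothed).
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical example tweets, lower-cased and split on whitespace.
tweets = [
    "flood near the river".split(),
    "earthquake near the city".split(),
    "flood warning for the city".split(),
]
vectors = tf_idf(tweets)
```

A word that occurs in every tweet (here "the") receives weight zero, while a word unique to one tweet (here "river") receives the largest weight in that tweet. Word order plays no part in the result, which is the loss of syntactic and semantic structure noted above.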
Algorithm Selection – The geoparsing techniques build on named entity recognition (NER), which can be carried out with several algorithms after some preprocessing to learn the sequential structure of the tweet. This approach is generally perceived to be the better of the two, and significant advances have been made in the field to reduce its disadvantages and to enhance the geoparsing strategy.
In summary, several studies divide the processing into two steps: a training step and a sensing step. In the training step, the data is labeled with named entity tags, preprocessed to remove noise, and embedded as word vectors using a learning algorithm; a recurrent neural network (RNN), specifically a bidirectional LSTM (biLSTM) with a CRF output layer, is then trained to extract the geographic description and the time of the tweet (Delso et al., 2018). In the sensing step, Twitter data is queried, preprocessed, and embedded; it is classified using the model trained in the first step; and geoparsing and geocoding identify the spatial location of the data as latitude and longitude. The final stage is to run an algorithm that identifies relationships in the dataset, for example through spatial regression, in order to understand the dynamics of the event: where it is common, how frequently it is reported, and the spatial extent of the incident. Several approaches can be used at this stage, including DBSCAN, ST-DBSCAN, MR-VDBSCAN, KDE (the Silverman equation), and adaptive KDE.
Figure 1: A sample process used in analyzing geotagged social network data. Source: Hernandez-Suarez et al., 2019.
As introduced above, clustering algorithms are used to analyze geo-tagged social network data. Clustering approaches are classified into five types: hierarchical, partitional, model-based, grid-based, and density-based (Hernandez-Suarez et al., 2019). Density-based methods are the most popular for analyzing Twitter data because they classify points according to whether they lie in low-density or high-density regions. Several density-based clustering algorithms exist, and the choice among them depends on the need to improve clustering quality and to address the weaknesses identified in another algorithm. Below is an outline of the density-based clustering algorithms that previous studies have used successfully.
Kernel Density Estimation – KDE (the Silverman equation)
Kernel density estimation is a non-parametric statistical tool for estimating the probability density function of a random variable. It is used to represent hotspots graphically from spatial point data. The KDE at a location is computed by summing the contributions of the surrounding points, each weighted by a kernel function K according to its distance from that location, which produces smooth surfaces (Ram et al., 2010). A bivariate example is the formula in Figure 2, where h is the bandwidth of the Gaussian kernel function K(·) in the spatial dimension (Koylu et al., 2019). The model is commonly used to identify hotspot areas; however, it generalizes the output into smooth surfaces, which may misrepresent some datasets.
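For reference, a standard textbook form of the bivariate Gaussian kernel density estimate is sketched below. It is not copied from Koylu et al. (2019) and may differ from their notation. The symbols are assumptions for this sketch: n is the number of points, (x_i, y_i) are their coordinates, h is a single bandwidth shared by both axes, and d_i is the Euclidean distance from the evaluation location to point i.

```latex
\hat{f}(x, y) = \frac{1}{n h^{2}} \sum_{i=1}^{n} K\!\left(\frac{d_i(x, y)}{h}\right),
\qquad
K(u) = \frac{1}{2\pi} \exp\!\left(-\frac{u^{2}}{2}\right),
\qquad
d_i(x, y) = \sqrt{(x - x_i)^{2} + (y - y_i)^{2}}
```

One common reading of "the Silverman equation" is Silverman's rule of thumb for choosing h. In d dimensions it scales the sample standard deviation by (4 / ((d + 2) n))^{1/(d+4)}, which reduces to n^{-1/6} when d = 2.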
Figure 2: KDE estimation function. Source: Koylu et al., 2019.
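The following is a minimal sketch of a bivariate Gaussian KDE with a rule-of-thumb bandwidth, written with the Python standard library only. It assumes a single isotropic bandwidth and pools the two per-axis standard deviations into one spread; that pooling, the function names, and the sample points are illustrative assumptions rather than the method of any cited paper.

```python
import math
import statistics


def silverman_bandwidth(points):
    """Rule-of-thumb bandwidth for 2-D data: h = sigma * n**(-1/6)."""
    n = len(points)
    sd_x = statistics.stdev([p[0] for p in points])
    sd_y = statistics.stdev([p[1] for p in points])
    # Assumption: pool both axes into one isotropic spread.
    sigma = math.sqrt((sd_x ** 2 + sd_y ** 2) / 2)
    return sigma * n ** (-1 / 6)


def kde(x, y, points, h):
    """Gaussian kernel density estimate at location (x, y)."""
    total = 0.0
    for px, py in points:
        # Squared distance to the point, scaled by the bandwidth.
        u2 = ((x - px) ** 2 + (y - py) ** 2) / h ** 2
        total += math.exp(-u2 / 2) / (2 * math.pi)
    return total / (len(points) * h ** 2)


# Hypothetical coordinates: a tight group of four points and one outlier.
points = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (0.1, -0.1), (5.0, 5.0)]
h = silverman_bandwidth(points)
near = kde(0.1, 0.0, points, h)  # inside the tight group
far = kde(2.5, 2.5, points, h)   # between the group and the outlier
```

With these points the estimate inside the group is higher than the estimate midway to the outlier. A single outlier inflates the pooled standard deviation and therefore the bandwidth, which is one reason adaptive-bandwidth variants are of interest.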
Several improvements have been made to the KDE function to address gaps identified in its use. One is the localized KDE (LKDE) algorithm of Al Boni and Gerber (2016). The model uses localized estimators that produce non-smooth density estimates. The interpolation is not uniform across all points; it varies from cell to cell based on the kernel weight. The method is perceived to be faster than KDE. The algorithm works by building an overlay grid, counting the incidents in each grid cell, fixing the center of each cell, and running a convolution operation (Al Boni and Gerber, 2016).
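The grid-based steps described above (overlay a grid, count incidents per cell, run a convolution) can be sketched as follows. This is a generic illustration and not Al Boni and Gerber's implementation: the function names, the grid anchored at the origin, and the fixed 3x3 smoothing kernel are all assumptions.

```python
def grid_counts(points, cell, nx, ny):
    """Count points per cell of an nx-by-ny grid anchored at (0, 0)."""
    grid = [[0] * nx for _ in range(ny)]
    for x, y in points:
        col, row = int(x // cell), int(y // cell)
        if 0 <= col < nx and 0 <= row < ny:  # ignore points off the grid
            grid[row][col] += 1
    return grid


def convolve(grid, kernel):
    """Smooth the grid with an odd-sized square kernel, zero-padded.

    The kernel is not flipped, so this equals a true convolution only
    for symmetric kernels such as the one used below.
    """
    ny, nx = len(grid), len(grid[0])
    r = len(kernel) // 2
    out = [[0.0] * nx for _ in range(ny)]
    for row in range(ny):
        for col in range(nx):
            for dr in range(-r, r + 1):
                for dc in range(-r, r + 1):
                    rr, cc = row + dr, col + dc
                    if 0 <= rr < ny and 0 <= cc < nx:
                        out[row][col] += grid[rr][cc] * kernel[dr + r][dc + r]
    return out


# Hypothetical incidents on a 3x3 grid with unit cells.
incidents = [(0.5, 0.5), (0.6, 0.4), (2.5, 2.5)]
counts = grid_counts(incidents, cell=1.0, nx=3, ny=3)
kernel = [[w / 16 for w in row] for row in [[1, 2, 1], [2, 4, 2], [1, 2, 1]]]
smoothed = convolve(counts, kernel)
```

Because the work is done once per cell rather than once per point pair, the cost depends on the grid size and kernel size rather than on the number of incidents, which is consistent with the speed advantage claimed for grid-based estimators.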
Another algorithm developed to improve on the traditional model is network kernel density estimation (NKDE). It considers the density of events with respect to a network, such as a street network, and it accounts for overlapping entry points. The model computes the densities independently and then sums the densities of overlapping points and the densities for nearby obstacles. NKDE analyzes the distribution of a point pattern along a network without assuming that space is isotropic or homogeneous; instead, it infers the distribution from the events themselves. It is a more precise model for points with network-related aspects (Delso et al., 2018).
Density-Based Spatial Clustering of Applications with Noise – DBSCAN
DBSCAN is a clustering approach that examines the data using two parameters: a specified radius (eps) and a minimum number of points (minPts) (Parimala et al., 2011). The algorithm uses the density of the points to form clusters of arbitrary shape and to distinguish noise from the elements under study. It identifies regions of high density and separates them from regions of low density, with the two thresholds as the defining factors (Gaonkar and Sawant, 2013). Density is measured by counting the number of points within the specified radius of each point, and each point is then classified as a core, border, or noise point: a core point has at least minPts points within eps, a border point has fewer but lies within eps of a core point, and a noise point is neither. Because the approach identifies noise within the dataset and allows clusters of arbitrary shape, it is considered more accurate in defining spatial extents.
Figure 3: DBSCAN model: (a) p density-reachable from q, (b) p and q density-connected by o, and (c) identification of core object, border object, and noise. Source: Birant and Kut, 2007.
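The core, border, and noise logic can be illustrated with a minimal brute-force sketch in standard-library Python. It assumes Euclidean distance and counts a point as its own neighbor; the function names and the sample points are illustrative, and the quadratic neighbor search is suitable only for small datasets.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks noise points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical coordinates: two groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The first four points form one cluster, the next three form a second, and the isolated point is labeled as noise.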
Varied Density-Based Clustering Method – VDBSCAN
Liu et al. developed the varied density algorithm to find clusters of differing densities within a dataset (Liu et al., 2007). The algorithm uses several values of the specified radius, one for each density level identified from the k-dist plot. The model has five phases: (1) compute and store the k-dist value for each object and partition the k-dist plots; (2) determine the number of densities from the k-dist plot; (3) choose the radius parameter for each density automatically; (4) scan the dataset and form the clusters for each density using the corresponding radius; and (5) display the valid clusters. The model is capable of handling local density variations within a cluster.
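The k-dist idea can be sketched in a few lines of standard-library Python. The `k_dist` function follows the usual definition (distance from each point to its k-th nearest neighbor). The `density_levels` function, which splits the sorted values wherever one value is more than `jump` times its predecessor, is a simplified stand-in assumed here for illustration; it is not the selection procedure from Liu et al. (2007).

```python
import math


def k_dist(points, k):
    """Sorted distances from each point to its k-th nearest neighbor."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


def density_levels(sorted_kdist, jump=2.0):
    """Split sorted k-dist values at sharp jumps (hypothetical heuristic)."""
    levels = [[sorted_kdist[0]]]
    for prev, cur in zip(sorted_kdist, sorted_kdist[1:]):
        if prev > 0 and cur > jump * prev:
            levels.append([])  # sharp rise: start a new density level
        levels[-1].append(cur)
    return levels


# Hypothetical data: a dense group (spacing 1) and a sparse group (spacing 5).
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (20, 20), (25, 20), (20, 25), (25, 25)]
kd = k_dist(points, k=2)
levels = density_levels(kd)
# One candidate radius per density level: the largest k-dist in that level.
eps_values = [max(level) for level in levels]
```

Each resulting radius could then be passed to a DBSCAN run restricted to the points not yet clustered, starting with the smallest radius, which is the general pattern that phase 4 describes.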
Spatial-Temporal DBSCAN – ST-DBSCAN
The ST-DBSCAN algorithm was developed to improve DBSCAN by addressing three problems: clustering spatial-temporal data by combining distance and time variables, identifying noise objects within clusters, and separating points that belong to adjacent clusters (Birant and Kut, 2007). The algorithm uses four parameters: two radii (Eps1 for the distance between spatial attributes and Eps2 for the distance between non-spatial attributes), MinPts, and Δε, which prevents clusters from being merged because of small differences in the non-spatial values of neighboring locations. The model also assigns a density factor to each cluster to support comparison between clusters.
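A minimal sketch of the two-radius neighborhood test is shown below. It assumes each record is a tuple (x, y, v), where v is a single non-spatial value such as a timestamp in hours; that layout, the function name, and the sample data are assumptions for illustration. The Δε check and the density factor are not shown.

```python
import math


def st_neighbors(points, i, eps1, eps2):
    """Indices within eps1 in space AND within eps2 in the non-spatial value."""
    x, y, v = points[i]
    return [
        j for j, (qx, qy, qv) in enumerate(points)
        if math.hypot(x - qx, y - qy) <= eps1 and abs(v - qv) <= eps2
    ]


# Hypothetical records: (x, y, hours since the start of collection).
records = [
    (0.0, 0.0, 0.0),
    (0.5, 0.0, 1.0),    # close in space and in time
    (0.2, 0.1, 48.0),   # close in space, two days later
    (10.0, 10.0, 0.5),  # close in time, far away in space
]
neighbors = st_neighbors(records, 0, eps1=1.0, eps2=2.0)
```

Only the first two records qualify as neighbors of record 0. Substituting this query for the single-radius neighbor search of DBSCAN gives the basic spatial-temporal behavior described above.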
MapReduce Density-Based Method
The MapReduce density-based method was proposed by He et al. (2014). The model implements density-based clustering as Map/Reduce jobs on the Hadoop platform (Ishwarappa, 2015). It processes the dataset in three steps: data partitioning, local clustering, and global merging. The algorithm is guided by the concept of load balancing for large-scale datasets, and it is efficient at scaling up and speeding up the processing of skewed big data. Other algorithms have built on this approach by modifying the individual steps. Dai and Lin proposed an improved data partitioning that reduces the number of boundary points; their algorithm places partition boundaries according to the distribution of the data so that the load is balanced across nodes (Dai and Lin, 2012). The MapReduce method is also advantageous because it supports parallel processing, can be executed on a cloud system, and does not depend on a global index.
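The data partitioning step can be illustrated with a deliberately simplified sketch. It splits the points into equal-width vertical strips and replicates any point lying within eps of a strip edge, so that clusters crossing a boundary can be merged later. Equal-width strips ignore the load balancing that the cited methods emphasize, and the function name and data are hypothetical; this is not the partitioning scheme of He et al. or of Dai and Lin.

```python
def partition_with_halo(points, n_parts, eps):
    """Assign point indices to vertical strips, replicating boundary points."""
    xs = [p[0] for p in points]
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_parts or 1.0  # guard against zero-width data
    parts = [[] for _ in range(n_parts)]
    for idx, (x, _y) in enumerate(points):
        for k in range(n_parts):
            left = lo + k * width
            right = left + width
            # Extend each strip by eps on both sides (the "halo").
            if left - eps <= x <= right + eps:
                parts[k].append(idx)
    return parts


# Hypothetical data: ten points on a line, split into two partitions.
points = [(float(i), 0.0) for i in range(10)]
parts = partition_with_halo(points, n_parts=2, eps=1.0)
shared = set(parts[0]) & set(parts[1])
```

Each partition could be clustered independently (the local clustering step), and the points in `shared` indicate which local clusters need to be joined in the global merge.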
Ordering Points to Identify the Clustering Structure – OPTICS
This model is described in the survey by Parimala et al. (2011). It aims to improve on DBSCAN by accommodating varying densities in the data. It works similarly to DBSCAN, but instead of producing a single clustering it creates an augmented ordering of the points, in which each point carries a distance value that reflects the density around it.
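In the usual formulation of OPTICS, the ordering is driven by two quantities: the core distance of a point and the reachability distance of one point with respect to another. The sketch below computes both with brute-force distance calculations. It assumes Euclidean distance and counts a point as its own neighbor; the function names and sample points are illustrative, and the ordering loop itself is omitted.

```python
import math


def core_distance(points, i, eps, min_pts):
    """Distance to the min_pts-th closest point (self included).

    Returns None when point i is not a core point within radius eps.
    """
    dists = sorted(math.dist(points[i], q) for q in points)
    if len(dists) < min_pts or dists[min_pts - 1] > eps:
        return None
    return dists[min_pts - 1]


def reachability_distance(points, o, p, eps, min_pts):
    """Reachability of point o with respect to p; None if p is not core."""
    core = core_distance(points, p, eps, min_pts)
    if core is None:
        return None
    return max(core, math.dist(points[p], points[o]))


# Hypothetical data: three points close together and one far away.
points = [(0, 0), (1, 0), (2, 0), (10, 0)]
core_of_first = core_distance(points, 0, eps=3.0, min_pts=3)
core_of_last = core_distance(points, 3, eps=3.0, min_pts=3)
reach = reachability_distance(points, 0, 1, eps=3.0, min_pts=3)
```

Points in dense regions end up with small reachability values and points in sparse regions with large ones, so clusters of different densities appear as valleys of different depth when the values are plotted in visiting order.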
Overall, the suitability of each algorithm depends on the type of analysis. The improved variants are generally preferable, however, because they build on the traditional methods and address specific weaknesses, such as varying densities, temporal attributes, network constraints, or very large datasets.
References
Al Boni, M., and Gerber, S. (2016). Area-specific crime prediction models. 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), Anaheim, CA, USA, 671–676.
Birant, D., and Kut, A. (2007). ST-DBSCAN: An algorithm for clustering spatial–temporal data. Data and Knowledge Engineering, 60(1), 208–221.
Dai, R., and Lin, C. (2012). Efficient Map/Reduce-based DBSCAN algorithm with optimized data partition. In: Fifth international conference on cloud computing.
Delso, J., Martín, B., and Ortega, E. (2018). A new procedure using network analysis and kernel density estimations to evaluate the effect of urban configurations on pedestrian mobility. The case study of Vitoria-Gasteiz. Journal of Transport Geography, 67, 61–72.
Finch, C., Snook, R., Duke, H., Fu, W., Tse, H., Adhikari, A., and Fung, I. (2016). Public health implications of social media use during natural disasters, environmental disasters, and other environmental concerns. Natural Hazards, 83, 729–760.
Gaonkar, N., and Sawant, K. (2013). AutoEpsDBSCAN: DBSCAN with Eps automatic for large dataset. International Journal of Advanced Computing Theory and Engineering 2(2):2319–526.
Hernandez-Suarez, A., Sanchez-Perez, G., Toscano-Medina, K., Perez-Meana, H., Portillo-Portillo, J., Luis, S., and Javier, L. (2019). Using Twitter data to monitor natural disaster social dynamics: A recurrent neural network approach with word embeddings and kernel density estimation. Sensors (Basel, Switzerland), 19(7), 1746.
Ishwarappa, J. (2015). A brief introduction on Big Data 5Vs characteristics and Hadoop technology. Procedia Computer Science, 48, 319–324.
Koylu, C., Zhao, C., and Shao, W. (2019). Deep neural networks and kernel density estimation for detecting human activity patterns from geo-tagged images: A case study of birdwatching on Flickr. ISPRS International Journal of Geo-Information.
Liu, P., Zhou, D., and Wu, N. (2007). VDBSCAN: Varied density based spatial clustering of applications with noise. In: International conference on service systems and service management, Chengdu.
Matheson, D. (2018). The performance of publicness in social media: Tracing patterns in tweets after a disaster. Media, Culture & Society, 40, 584–599.
Parimala, M., Lopez, D., and Senthilkumar, N. (2011). A survey on density based clustering algorithms for mining large spatial databases. International Journal of Advanced Science and Technology, 31, 59–66.
Ram, A., Jalal, S., and Kumar, M. (2010). A density based algorithm for discovering density varied clusters in large spatial databases. IJCA, 3(6), 1–4.
Sakaki, T., Okazaki, M., and Matsuo, Y. (2010). Earthquake shakes Twitter users: Real-time event detection by social sensors. Proceedings of the ACM 19th International Conference on World Wide Web, Raleigh, NC, USA, 26–30 April 2010, 851–860.
Zhang, W., Yoshida, T., and Tang, X. (2011). A comparative study of TF*IDF, LSI and multi-words for text classification. Expert Systems with Applications, 38, 2758–2765. doi: 10.1016/j.eswa.2010.08.066.