
Zheng Xiang Daniel R. Fesenmaier Editors

Nguyễn Gia Hào

Academic year: 2023


Text

Big data brings new opportunities to modern society (Fan, Han, & Liu, 2014), as these vast new repositories of information can provide researchers, managers and policy makers with the data-driven evidence needed to make decisions based on numbers and analysis rather than anecdotes, guesswork, intuition or past experience (Frederiksen, 2012). This can lead to more accurate analysis, more confident decision-making, greater operational efficiency, and reductions in cost and risk (De Mauro, Greco, & Grimaldi, 2015). The value of tourism big data can be described by its new applications in the tourism industry.

Graph database: designed to store and represent data that utilize a graph model

Consumer Behavior

In tourism, every question must be answered immediately to remain relevant to travellers, and this is what makes big data so important. With the strong growth in the volume and applications of big data, traditional tourism data and methods will be linked to the new data and methods: for example, call centers will be linked to online consumer reviews, and loyalty programs will be linked to booking histories.

Feedback Mechanisms

We now move on to the most important step of using big data in tourism forecasting, as we know that big data can bring many benefits to the tourism industry.

Capturing Big Data for Tourism Forecasting


Figure 2 displays the framework of tourism forecasting with big data. There are three important steps: (1) data exploration, which is the data processing that prepares the proper data for the model; (2) use modeling techniques to predict user behavior on t

Heterogeneity in Tourists

However, one of the weak points of the Revealed Preferences Approach derives from the fact that the estimation of preferences is made at a global sample level, which does not allow representation of individual level preferences. One of the procedures proposed in the literature to incorporate heterogeneity of preferences assumes the existence of differentiated response parameters for each individual.

Fig. 2 Individual revealed preferences through choices made

Choice Set

Moreover, to segment the market, the discrete distribution has an advantage over the continuous distribution in that there is no need to assume a concrete probability distribution, since the segments are obtained through empirical data. However, the discrete approach has two important limitations: (1) the estimation becomes complex with six or more mass points (Allenby & Rossi), which prevents capturing the full heterogeneity of the sample; and (2) the inability to identify the preferences of individuals located beyond a certain threshold of the distribution function (e.g., in the tails of the distribution).
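To make the discrete (segment-based) representation of heterogeneity concrete, the choice probability can be written as a finite mixture over segments. The expression below is only a sketch: it assumes a multinomial logit kernel, which the text does not specify, and the symbols (S segments with shares π_s, segment-specific parameters β_s, attribute vectors x_ij, and choice set C_i) are introduced here for illustration.

```latex
P_i(j) = \sum_{s=1}^{S} \pi_s \,
  \frac{\exp\left(\beta_s^{\top} x_{ij}\right)}
       {\sum_{k \in C_i} \exp\left(\beta_s^{\top} x_{ik}\right)},
\qquad \pi_s \ge 0, \quad \sum_{s=1}^{S} \pi_s = 1
```

Each mass point of the discrete distribution corresponds to one pair (π_s, β_s), so adding mass points adds parameters, which is consistent with the estimation difficulty noted above.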

Information Hierarchy

In fact, this model would allow estimating the expected demand (of clicks) on a given day. Another important challenge for measuring the traveler experience is related to the fundamental nature of the tourism experience.

Fig. 3 Hierarchical hotel decision with n destinations and m hotels

Vision

For example, vision is more related to action (Schifferstein & Cleiren, 2005), while the sense of smell evokes stronger emotional responses (Schifferstein & Desmet, 2007), and the sense of touch plays a more important role in social relationships. Google Glass, for instance, is equipped with a miniature computer and many sensors, including a front-facing camera for taking pictures and a basic eye tracker that continuously collects the user's visual perception data (Ishimaru et al., 2014).

Hearing

SoundSense, for example, collects sound data using smartphone microphones and can be used to classify 'meaningful' events from it (Lu, Pan, Lane, Choudhury, & Campbell). Other work discusses systems that use crowd-sourced sound sensing, whose low-cost data collection and analysis provide massive coverage in both space and time for observing places and events (Kanhere, 2011).

Smell

Taste

Several studies have examined the impact of taste on human perception, behavior and memory (Krishna, 2012). The sense of taste in the tourism context is often seen as part of the food consumption process (Kim, Eves, & Scarles, 2009).

Touch

Recently, studies have focused on developing sensors for use in medicine and sports, which include flexible pressure sensors to measure the sensation of touch (Dahiya et al., 2010). In particular, special types of fibers with textile sensors have been used to measure the sensation of touch throughout the body (Ugur, 2013).

Table 1 Sensory modalities and measurement sensors for touristic experience

Other Somatosensory Modalities: Movement, Temperature, and Pain

The dynamic nature of tourism experiences is one of the main obstacles to capturing the traveler's sensory experience.

Table 1 Quantified-self categories and measures

The Quantified Traveler and Context-Awareness

This means that many people are actively using 'sensors' to collect real-time data about their surroundings as well as their physical/emotional state (and stored personal historical data), which in turn creates massive amounts of data that strongly support individual decisions; for example, many outdoor enthusiasts collect and share information about birds, consistently collect weather data for local reporting, or search the skies for observations of new phenomena (Goodchild, 2007). Importantly, changes in context and subsequent behavior (in terms of spatial/temporal movements) can transform the way travelers communicate and/or experience a destination (Kim & Fesenmaier, 2014; Yoo et al., 2008).

The Quantified Traveler and Ordinary Life

In the following, some possible uses of the concept of the quantified traveler in the development of smart tourism are defined. The concept of the quantified traveler offers both opportunities and challenges for the tourism industry.

Fig. 2 Data sharing and feedback loop in the ‘Quantified-Self’ community

Lessons from Private Sector Tourism

PPL reservation data includes as a minimum destination data, visitor data, order data, start dates, end dates and administrative information (e.g. fees paid). Without the data that data service companies provide, PPL managers can use internally collected historic reservation data to identify many of the same destination-specific attributes, including top markets, length of stay (duration), booking to arrival time (lead time), and average visitors per booking.
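The destination attributes named above (top markets, length of stay, lead time, and average visitors per booking) can all be derived from a handful of reservation fields. The following is a minimal sketch under stated assumptions: the record layout, field names and sample values are hypothetical and are not taken from any actual PPL reservation system.

```python
from collections import Counter
from datetime import date
from statistics import mean

# Hypothetical reservation records; field names and values are invented for illustration.
RESERVATIONS = [
    {"origin_zip": "80302", "order_date": date(2015, 5, 1),
     "start": date(2015, 7, 3), "end": date(2015, 7, 6), "visitors": 4},
    {"origin_zip": "97401", "order_date": date(2015, 6, 20),
     "start": date(2015, 6, 27), "end": date(2015, 6, 28), "visitors": 2},
    {"origin_zip": "80302", "order_date": date(2015, 1, 10),
     "start": date(2015, 1, 30), "end": date(2015, 2, 1), "visitors": 3},
]


def summarize(records, top_n=1):
    """Compute simple destination-level attributes from reservation records."""
    # Length of stay: days between start date and end date.
    stays = [(r["end"] - r["start"]).days for r in records]
    # Lead time: days between the order (booking) date and the start date.
    leads = [(r["start"] - r["order_date"]).days for r in records]
    # Top markets: most frequent visitor origins.
    markets = Counter(r["origin_zip"] for r in records)
    return {
        "length_of_stay": mean(stays),
        "lead_time": mean(leads),
        "visitors_per_booking": mean(r["visitors"] for r in records),
        "top_markets": markets.most_common(top_n),
    }


summary = summarize(RESERVATIONS)
```

Medians or quantiles could be substituted for the means where the distributions are skewed, as lead times often are.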

Preprocessing and Enriching PPL Reservation Data

The accuracy of the visitor's origin location in the booking dataset will dictate the resolution of the secondary attribute data that can be merged. This amenity data, when available, can be joined with booking data either geographically or through common identification codes.

Data Reduction & Geographic Data Mining

Space is then used to control measurement of the feature for the fixed time frame. Once space is controlled and time is established, various properties can be measured for the entire data set or for a subset of the data.

Utilizing Information Generated from Geographic Data Mining

40,000 zip codes and zip code centroids, while there are only ~30,000 zip code tabulation areas or polygons. The spatial join process allows attributes to be counted, summed, or averaged within the zip code tabulation areas.

Geovisualization for Pattern Interpretation of PPL Demand Populations

To minimize data loss, the booking attributes should first be summarized by visitor origin and combined with the zip code centroid coordinates. Then, using GIS, all zip code centroids that fall within a zip code tabulation area can be spatially joined to that area.
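The point-in-polygon join described here can be sketched without a GIS package. The example below is illustrative only: it uses a plain ray-casting test on planar coordinates, and the area names, polygon shapes, centroid coordinates and booking counts are all invented. A real workflow would use projected geographic data and a GIS tool.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of (x, y) vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(centroids, areas):
    """Count, sum and average bookings of centroids falling within each area."""
    totals = {name: {"count": 0, "sum": 0} for name in areas}
    unmatched = []
    for zip_code, (x, y, bookings) in centroids.items():
        for name, polygon in areas.items():
            if point_in_polygon(x, y, polygon):
                totals[name]["count"] += 1
                totals[name]["sum"] += bookings
                break
        else:
            # Centroid fell in no area: record it so the loss is visible.
            unmatched.append(zip_code)
    for stats in totals.values():
        stats["mean"] = stats["sum"] / stats["count"] if stats["count"] else None
    return totals, unmatched


# Hypothetical tabulation areas (two adjacent squares) and zip code centroids.
AREAS = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
CENTROIDS = {  # zip code -> (x, y, bookings summarized by visitor origin)
    "z1": (0.5, 0.5, 10),
    "z2": (1.5, 1.0, 5),
    "z3": (3.0, 1.0, 7),
    "z4": (9.0, 9.0, 2),
}

totals, unmatched = spatial_join(CENTROIDS, AREAS)
```

Tracking the unmatched centroids makes it possible to quantify how many bookings are dropped by the join.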

Geovisualization for Pattern Interpretation of PPL Destinations

Geovisualization of the 25th and 75th quantiles for travel lead time (not shown here) also shows similar patterns. With the passage of the Digital Accountability and Transparency Act of 2014 (DATA Act), there are now legal requirements to provide open data, and many federal agencies are working to overcome the challenges associated with data sharing (panel discussion on changing the culture for open data, 2015).

Figure 1 shows the percent of total reservation records (January 1, 2008–

Supervised Machine Learning Approaches

Results indicate that support vector machines and N-gram approaches outperform the Naïve Bayes approach. Results repeatedly indicate that support vector machines outperform the Naïve Bayes technique (Alves et al., 2014, p. 123).
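As a point of reference for the baseline in these comparisons, a Naïve Bayes text classifier can be sketched in a few lines. This is a generic multinomial Naïve Bayes with add-one smoothing, not the setup used in the cited studies, and the training sentences and labels are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """docs: list of (text, label). Returns class counts, per-class word counts, vocabulary."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_docs[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_docs, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log posterior for the given text."""
    class_docs, word_counts, vocab = model
    n_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        total = sum(word_counts[label].values())
        score = math.log(class_docs[label] / n_docs)  # log prior
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothed log likelihood of the word.
                score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy training data.
TRAIN = [
    ("great clean room", "pos"),
    ("friendly staff great location", "pos"),
    ("dirty room rude staff", "neg"),
    ("terrible noisy dirty", "neg"),
]
model = train_nb(TRAIN)
```

The independence assumption between words is what makes the model "naïve", and it is one commonly cited reason why margin-based classifiers such as support vector machines can do better on review text.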

Dictionary-Based Approaches

Similar to the study by Ye et al. (2009), Alves, Baptista, Firmino, de Oliveira and de Paiva (2014) compare support vector machines with Naive Bayes classification to perform sentiment analysis of tweets (written in Portuguese) during the FIFA Confederations Cup 2013. In addition, the paper provides an interesting example of a possible application of OpenER to the geolocation of hotel reviews (Pablos et al., 2015, p. 125).

Unsupervised Machine Learning Approaches

Finally, the proposed approach also includes a taxonomy to classify fragments according to their topic using a list of lemmatized and normalized words, each of which belongs to a different topic category (García et al., 2012, p. 35). Evaluation of the classification based on test data shows that the proposed system performs better compared to a predefined baseline: if a customer's rating is manually classified as good or bad, the classification is correct with a probability of about 90% (Gräbner et al., 2012, p. 460).

Semantic Approaches

Their study applies a text analytic approach to a large volume of consumer reviews obtained from Expedia.com to deconstruct the hotel guest experience and examine its association with satisfaction ratings. The semantic link between guest experience and satisfaction appears to be strong, suggesting that these two domains of consumer behavior are intrinsically linked (Xiang, Schwartz, Gerdes et al., 2015, p. 120).

Hybrid Approaches

Possible approaches to supervised topic detection are dictionary-based approaches or supervised machine learning techniques (more concretely, classification techniques such as Naive Bayes or Support Vector Machines [SVM]). Unsupervised topic detection typically takes place at the level of single words and thus can identify multiple topics within the same phrase or sentence (although it can also be aggregated at the sentence level, which is particularly meaningful if the topic detection is to be combined with a sentiment detection that takes place at the sentence level).
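A dictionary-based topic detector, aggregated at the sentence level, can be illustrated as follows. This is a minimal sketch: the topic categories and keyword lists are hypothetical, and a real lexicon would be lemmatized and far larger.

```python
import re

# Hypothetical topic dictionary; categories and keywords are invented for illustration.
TOPIC_LEXICON = {
    "room": {"room", "bed", "bathroom", "shower"},
    "food": {"breakfast", "dinner", "restaurant", "buffet"},
    "staff": {"staff", "reception", "service"},
}


def detect_topics(sentence):
    """Return the sorted list of topics whose keywords occur in the sentence."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return sorted(topic for topic, words in TOPIC_LEXICON.items() if tokens & words)
```

For example, `detect_topics("The breakfast was great but the bed was too soft.")` returns both `food` and `room`, which shows how a single sentence can carry more than one topic.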

Supervised Topic Detection

Unsupervised Topic Detection

We evaluated the approach above on the hotel reviews for hotels in Åre, and the 40-factor LSI achieved an accuracy of 88.39. In this case, review statements are no longer represented as a word vector (i.e. a bag of words); instead, each individual word is stored together with its context, that is, a certain number of words before and after the word, as well as word characteristics such as its position in the sentence, its length, etc.
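The word-with-context representation can be sketched as below. The feature names and the window size are assumptions made for illustration; the source does not state which window or which word characteristics were actually used.

```python
def context_features(sentence, window=2):
    """Represent each word by itself, its neighbours, and simple word characteristics."""
    tokens = sentence.lower().split()
    features = []
    for i, word in enumerate(tokens):
        features.append({
            "word": word,
            "before": tokens[max(0, i - window):i],   # up to `window` preceding words
            "after": tokens[i + 1:i + 1 + window],    # up to `window` following words
            "position": i,                            # position in the sentence
            "length": len(word),                      # number of characters
        })
    return features


feats = context_features("the room was very clean", window=2)
```

Unlike a bag of words, this representation preserves word order locally, so the same word can be treated differently depending on its neighbours.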

Subjectivity Detection

As the examples show, problems arise when the statements are ambiguous (for example: this can save costs for families with children) or contain a mixture of different opinions (for example:

Sentiment Detection

The final outcome of the sentiment analysis provides valuable information about customer reviews and opinions in a structured format. It was recognized in this research that positive reviews are likely to be more favorable than negative comments, and heuristic cues from online reviews lead readers to increase the perceived helpfulness of the reviews.

Table 6 shows some examples for the sentiment detection. Here again we can see that problems occur if either statements contain multiple opinions with a different sentiment (e.g

Poisson Estimation

Negative Binomial Estimation

One way to verify the validity of the negative binomial model against the Poisson model is to test the null hypothesis α = 0. Thus, this study assesses the suitability of the Poisson and negative binomial models for understanding the characteristics of the data distribution.

Model Estimations

Therefore, to address the limitations of Poisson modeling, this study applied an alternative count model based on a negative binomial distribution (Cameron & Trivedi, 2013). One way to verify the validity of the negative binomial model, as opposed to the Poisson model, is to test the null hypothesis that the dispersion parameter is zero (α = 0, with α as denoted in the equation discussed in the literature review), which reflects equality of mean and variance, E(y_t) = V(y_t).
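The link between the dispersion parameter and equidispersion can be written out explicitly. The sketch below assumes the common NB2 parameterization of the negative binomial model, which the text does not spell out; μ_t denotes the conditional mean and is introduced here for illustration.

```latex
\begin{align}
\text{Poisson:} \quad & E(y_t) = V(y_t) = \mu_t \\
\text{Negative binomial (NB2):} \quad & E(y_t) = \mu_t, \qquad V(y_t) = \mu_t + \alpha \mu_t^{2} \\
H_0 : \alpha = 0 \quad & \Longrightarrow \quad V(y_t) = \mu_t = E(y_t)
\end{align}
```

Under this parameterization the Poisson model is the special case α = 0, so rejecting the null hypothesis favors the negative binomial specification.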

Measurement

In addition, the location of the restaurants was added as another control variable to test the potential confounding effect on the results (1 = London and 0 = New York). It was identified that the model possesses heteroscedasticity, potentially leading to the misrepresentation of the estimated variances of the coefficients compared to relevant true variances.

Analysis of Count Models

However, it is important to consider a critical limitation of the Poisson model, such as over- or underdispersion. When comparing the unconditional mean and variance of the dependent variables (see Table 2), the results do not show equidispersion.
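A quick descriptive check for equidispersion compares the unconditional mean and variance of the count variable. The sketch below is only illustrative: the counts are invented (they are not the values in Table 2), and the moment-based estimate of α assumes the NB2 variance function V(y) = μ + αμ², which is an assumption rather than something stated in the text. It does not replace a formal test of α = 0.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Compare the unconditional mean and variance of a count variable."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    return {
        "mean": m,
        "variance": v,
        "ratio": v / m,                   # about 1 under equidispersion, > 1 if overdispersed
        "alpha_mom": (v - m) / (m * m),   # moment estimate assuming V(y) = mu + alpha * mu^2
    }


# Invented example counts (e.g., helpfulness votes per review).
VOTES = [0, 1, 0, 2, 10, 0, 3, 0, 1, 13]
result = dispersion_summary(VOTES)
```

A variance-to-mean ratio well above one, as in this toy sample, is the pattern that motivates moving from the Poisson to the negative binomial model.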

Assessing the Effect of Star Ratings on Review Evaluations

This chapter examined possible asymmetries in the effect of online reviews on usefulness and satisfaction and suggested the use of the negative binomial model as an appropriate method to cope with count data.

Factual Knowledge

They address important questions for tourism decision makers: What are the driving factors that influence a destination's reputation among bloggers, journalists and social media users? Contextual information can, for example, guide the acquisition of tourism-related content through focused crawling (Mangaravite, Assis, & Ferreira, 2012), increase the accuracy of knowledge extraction algorithms tailored to the specifics of user-generated content, or help to understand the role of affective knowledge in the decision-making process (Hoang, Cohen et al., 2013).

Affective Knowledge

Through the use of the radar map the visualization of the public discourse on destination and Aakers'

Fig. 1 Screenshot of the tourism monitor Web intelligence platform, showing a query on

OTR Title

This phenomenon of an extended perceived image conveyed by OTR titles can be understood in the context of the two-way mutual influence of projected and perceived images (Marine-Roig, 2015a), where tourists reproduce perceived images through their actions and transmit them to others, thus closing the hermeneutic circle of images (Caton & Almeida, 2008). However, it is important to note that on OTR sites, titles are part of the paratextual elements and the review web host also adds information to the same titles, so this should also be considered in terms of destination image.

Another OTR peritext

Azariah (2011) points out that those analyzing travel blogs to study destination imagery should recognize the web host's contribution to positioning the blog as a travel story. However, the authors point out that the star rating data is insufficient to understand the tourist experience and that this data should be combined with the analysis of the rating description (text and title).

Destination Choice

Specifically, the mean rating of an author's historical ratings can be used to infer baseline attitudes toward travel reviews, either positive or negative. For example, the writers from TravelReview scored the quantitative overall star rating out of five, plus the facilities-type-specific ratings out of five (such as cleanliness and service for accommodations).
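The idea of an author-specific baseline can be illustrated with a small calculation. This is a sketch under assumptions: the rating history is invented, and expressing a new rating as a deviation from the author's historical mean is one simple way to use the baseline, not necessarily the procedure used in the chapter.

```python
from statistics import mean


def baseline_adjusted(history, new_rating):
    """Return the author's baseline (mean historical rating) and the deviation from it."""
    baseline = mean(history)
    return baseline, new_rating - baseline


# Invented example: an author who usually rates generously gives a 4 out of 5.
baseline, deviation = baseline_adjusted([5, 5, 4, 5, 5], 4)
```

Here a four-star rating sits below this author's usual level, so it may signal a less positive experience than the same rating from a habitually critical reviewer.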

Webhost Selection

France recorded 32.4 million travelers who generated 66.3 million overnight stays (CRT, 2016) and Catalonia 17.6 million travelers who generated 52.0 million overnight stays (IDESCAT, 2016).

Data Collection

OEE stores the OTR content and records related attractions on the local hard disk, and keeps information about its hyperlinks in the file names.

Web Data Mining

The remaining peritext fields of the Review, Reviewer, and Withdrawal tables (Fig. 1) are similarly obtained by entering regular expressions in the search box of the RSP tool. First, however, to prevent RSP from collecting redundant information, the related reviews (epitext) added by the webmaster to the page of each item should be removed.

Data Arranging

Parser Settings

The parsing algorithm follows these steps: (1) Loads input text and list files and converts text to lower case; (2) Reads compound words, removes them from the input text and adds them to the set with the frequency if greater than zero; (3) Divides the remaining text into words according to the separators; (4) For each word, it increments the whole word counter, and if it has a length of more than two characters and is not blacklisted, it adds it to the set (if the word was already in the set, raises its frequency); and (5) Creates a CSV file with three columns: whole word; frequency; and percentage related to all words of the input text including stop words.
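The five steps above can be sketched as follows. This is an illustrative reimplementation, not the authors' parser: it takes in-memory strings instead of files, uses a semicolon as the CSV delimiter, treats any run of non-word characters as a separator, and counts each compound-word occurrence as one word in the total. All of these are assumptions.

```python
import csv
import io
import re
from collections import Counter


def parse(text, compound_words, blacklist, separators=r"[^\w]+"):
    """Count word frequencies following the five parsing steps; returns CSV text."""
    text = text.lower()                                    # step 1: lower case
    freq = Counter()
    total_words = 0
    for compound in compound_words:                        # step 2: compound words
        compound = compound.lower()
        count = text.count(compound)
        if count > 0:
            freq[compound] += count
            total_words += count  # assumption: a compound counts as one word
            text = text.replace(compound, " ")
    for word in re.split(separators, text):                # step 3: split on separators
        if not word:
            continue
        total_words += 1                                   # step 4: count every word
        if len(word) > 2 and word not in blacklist:
            freq[word] += 1
    out = io.StringIO()                                    # step 5: three-column CSV
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    writer.writerow(["word", "frequency", "percentage"])
    for word, count in freq.most_common():
        writer.writerow([word, count, f"{100 * count / total_words:.2f}"])
    return out.getvalue()


rows = parse(
    "Sagrada Familia is in Barcelona. The Sagrada Familia is by Gaudi.",
    compound_words=["sagrada familia"],
    blacklist={"the"},
).splitlines()
```

Because the percentage is computed against all words, including stop words and short words, the percentages of the retained words do not sum to 100.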

Categorisation

The first similarity is that the most frequent word in both UGC + WGC and UGC is the capital of each region: Paris in the case of Ile-de-France and Barcelona (1st in UGC + WGC and 2nd in UGC) in the case of Catalonia. It is noteworthy, however, that in Catalonia's case there is a word that relates to a great extent to the region's main attractions and to its tourist identity, namely the architect Gaudí.

Images

Fig. 4 Hierarchical hotel decision with m hotels and n destinations