1. Introduction
Much empirical research has shown that news events have an important impact on the stock market. According to Yin’s classification standard [1], news can be divided into specific news and general news. Specific news refers to news stories in which the affected stock entities are clearly specified in the news text, while general news refers to stories in which they are not. Specific news usually involves announcements about stocks. General news includes industry news, policy news, microeconomic news and so on. When analyzing the impact of general news on the stock market, we first need to determine the stock entities that may be affected by such news. Because the two kinds of news require different processing methods, this paper focuses only on stock announcement news, which reflects the recent development of a listed company. Such news can assist investors in making decisions and can be used in stock return prediction. Determining the event type is one of the main tasks of event extraction. Announcement news covers all aspects of information about listed companies, so building a mature event type framework from stock announcement news is a challenge.
To date, there is no unified classification framework or standard for the news announcements regarding the Chinese stock market. Therefore, the existing research has constructed various event type frameworks using expert domain knowledge and experience [2], clustering [3], ontology [4], and other methods [5]. Some studies have implemented a fine-grained event type framework. Following an analysis of the existing studies, we believe that there is still room for improvement. The existing methods usually focus on the event types that occur more frequently or are generally considered important. Some low-frequency event types are usually neglected, and some event types can be further subdivided. For example, an event type called legal has been constructed in many studies, which regards violations of company policy and violations of the law as being the same. However, the two will receive different penalties, resulting in different expected impacts on investors. Therefore, we think it is more reasonable to regard the two as different event types.
Inspired by the “IDA-CLUSTERING + HUMAN-IDENTIFICATION” strategy [6], we propose a two-step method to divide stock announcement news into more detailed types. Event trigger words, which are usually verbs, play an important role in determining the event type. By combining these with the conventional expressions of a certain kind of announcement news, we can extract expressions from an announcement news set containing event trigger words. In order to take into account industry characteristics and event emotional tendency, we propose three event type judgment criteria to determine the final event type. Experimental results on real data from the Chinese stock market show that the event type framework constructed in this paper is reasonable and consistent with people’s cognition. Compared with existing related research, our method finds some reasonable and valuable event types that have not yet been discussed. Our work enriches the existing research, and the results will help investors.
After extracting all kinds of announcement news events, we did not choose to conduct the stock prediction work in the traditional way. This is due to the fact that we think it is inappropriate to rely solely on announcement news for prediction without considering other types of news such as industry news, financial news and so on. Instead, considering the unilateral trading policy in the Chinese stock market, we screened out some event types that are not valuable to investors.
2. Related Research
Event extraction is a typical task in the field of NLP that has been widely studied in the past. Due to the subject of this paper, we focus on event extraction methods in the economic domain. The ACE event typology “business”, which has four subtypes (Start-org, Merge-org, Declare-bankruptcy, and End-org), is relevant to the economic domain. The ACE event type definition does not meet our requirements. Therefore, researchers have proposed various methods to categorize types of financial events according to actual situations.
Fung et al. [7] classified financial news into two simple types of events: stimulating stock rise and stimulating stock fall. Wong et al. [8] carried out similar work, using a method based on feature words and template rules to identify three types of stock opinion events (rising, stable and falling). Du et al. [9] proposed the PULS business intelligence system, which detected 15 event types pre-categorized as “positive” or “negative”. Chen et al. [10] proposed a fine-grained event extraction method and applied it to a stock price prediction model. First, a professional financial event dictionary (TFED) was constructed manually by experts. The event type, event trigger word and event role were then determined by the dictionary, and the event was extracted using template rules. The abovementioned research did not separate stock announcement news from financial news, opting instead to combine the two. Events are classified and used as inputs for the prediction model, so the classification is usually rough. Liu [11] proposed a method for discovering financial events that affect stock movements. First, 13 types of financial events were manually determined according to industry characteristics; then, the keywords in the constructed financial ontology were used to annotate the text. Liu’s work classified financial news according to industry characteristics, and the constructed types are biased towards industry news.
The expert-created event typology of the Stock Sonar project identified nine event types: “Legal”, “Analyst Recommendation”, “Financial”, “Stock Price Change”, “Deals”, “Mergers and Acquisitions”, “Partnerships”, “Product”, and “Employment” [12]. The authors focused on the event types in stock announcement news, but the number of designed event types is small and the coverage is not wide. He [13] constructed a stock market theme event case base through ontology. The theme event types included financial policy events, monetary policy events and market rule adjustment, with multiple subtypes. The subject event was defined as a triple (event description, market description, event result). It can be seen from the constructed types that the author focused on three types of macroeconomic events and did not focus on stock announcement news. Wang [14] constructed a corpus of 2500 news texts that were manually divided into two categories and six sub-categories. Then, based on semantic, grammatical and syntactic features, the SVM method was used to identify event types. Chen [15] implemented an event extraction system in the financial field. First, eight event types were determined manually, and seed event sentences were selected for each event type. Then, the seed event trigger words were extracted using verb-object and subject-predicate relationships and were extended by word2vec to obtain the event trigger word dictionary. Han et al. [16] proposed a method for event extraction in the business field by combining machine learning and template rules. First, a business event type framework was defined manually, in which business events were divided into 8 categories and 16 sub-categories, and a small number of event triggers were constructed. Then, the trigger word dictionary was extended via word embedding to identify event types through multiple classification models combined with the trigger word dictionary.
References [14,15,16] manually classified the event type from the financial news while paying close attention to the design of the event recognition model. Boudoukh et al. [17] identified 18 event categories based on Capital-IQ types and a cross-section of academics.
Arendarenko et al. [18] proposed an event extraction system named BEECON (Business Events Extractor Component based on the ONtology) for business intelligence. The system can identify 11 types and 41 sub-types of business events from news texts using template rules. The experimental results verified that the system had high accuracy (95%). Although the authors built a rich and fine-grained event type framework, which includes some news on the stock market, they focused on events in the business domain, and the coverage of stock announcement news event types was not comprehensive enough. Zhang [19] proposed an event-driven stock recommendation model. The financial events were manually classified into 12 categories and 30 sub-categories. The fine-grained event type framework constructed by the author was centered entirely around stock announcement news. It covered most of the events in the announcement news, but also ignored some low-frequency event types, such as winning bid events. In addition, as we mentioned earlier, some event types can be further subdivided. In terms of event recognition, the accuracy reported on the domain data set (67.3%) was much lower than that of the template-rule method in [18] (95%). The template rules method can usually achieve high precision but requires considerable effort and expert experience. Some researchers have therefore considered automatic template rule generation, using a small training corpus and seed templates with weak supervision, bootstrapping or other methods to generate more templates automatically [20].
Zhou [21] implemented a financial event extraction system based on deep learning. In the system, experts manually divided financial events into types (4 categories and 34 sub-categories) and built two kinds of relationship tables between financial entities (personnel to enterprise, enterprise to enterprise). The author constructed a detailed event type framework around stock announcement news. Judging from the first-layer types, the coverage was not wide (far less so than that of [19]). However, the author divided the sub-types in a very detailed way, which is better than the divisions used in [19]. Wang et al. [22] proposed a bond event element extraction method based on CRF. The event element framework was manually predefined and included the bond event type and an event element list. Ding et al. [23] proposed a method to extract events from financial reports. Because financial report text follows a standard structure, the method takes the titles at all levels as the event categories and the paragraphs under each title as the extraction units. The authors constructed event types according to the characteristics of financial reports, so the method is not suitable for stock announcement news. Wu [20] used an improved TFIDF algorithm to calculate the weights of text features, then clustered the text using the K-means method. Finally, the most appropriate value, K = 13, was selected by enumeration. The 13 event types were: issuance, dividend, event prompt, pledge, performance notice, suspension and resumption of trading, fund-raising, increase or decrease of holdings, financial report, investment in subsidiaries, abnormal fluctuation, asset reorganization and change registration. The author used the clustering method to construct event types from stock news. Although some event types could be found in this way, others, especially those with low frequency, are easily ignored.
The event study method is also widely used by researchers to analyze the impact of news events on the Chinese stock market, which was initiated by Ball and Brown (1968) and Fama et al. (1969). It is essentially a statistical analysis method. The basic idea of the event study method is to select a certain type of specific event according to the research purpose, calculate the abnormal return index in the event window period, and then explain the impact of specific events on the change in sample stock price and return. There have been many achievements in the research on the Chinese stock market involving many types of stock announcement news events, such as monetary policy, industry related policies, epidemic situations, explosion accidents, earthquakes, avian influenza, the Shenzhou spacecraft launch, negative reputations, food safety accidents, environmental pollution, performance forecast events, corporate mergers and acquisitions, the lifting of stock bans and so on [24,25,26]. Besides stock markets, news event study also plays an important role in commodity markets [27,28].
3. Proposed Method
3.1. Extracting Event Trigger Words
Event trigger words, which are usually verbs, are key words that help us to identify event types. This paper first proposes an algorithm and a support calculation formula. The algorithm takes all stock announcement news texts as the input, extracts all verbs from the text, marks their emotional polarity according to an emotional dictionary, takes each verb as a candidate event trigger word and takes the announcement news containing the verb as a class. It then calculates the support between the other words and the verb, takes the other words that meet the threshold as collocations and judges the word order between collocations. Finally, it extracts the candidate event trigger words and co-occurrence words and arranges them in the order of common expressions. The procedure is given as Algorithm 1:
Algorithm 1: Extract Candidate Trigger Words and Collocations from Announcement News
The function of Formula (1) involves calculating the support between words and the verb, where CountB() represents the number of times the word appears before the verb, and CountA() stands for the opposite. If the absolute value of Formula (1) exceeds the threshold, this means that it is a conventional expression with a verb in the announcement news. If the result is positive, it means the word is usually in front of the verb. If the probability of a word appearing before and after the verb is close, it means that the word has no value in the representation of the event type.
(1)
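The counting step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the paper's Algorithm 1: announcements are assumed to be tokenized already, the verb is assumed to be known, and the normalization (CountB minus CountA, divided by the number of announcements containing the verb) is an assumed form that merely matches the verbal description of Formula (1). The names `support_scores` and `collocations`, the toy documents and the 0.5 threshold are all hypothetical, and the emotional-dictionary lookup and the ordering among collocations are not modelled.

```python
from collections import Counter


def support_scores(docs, verb):
    """Signed support of each co-occurring word relative to `verb`.

    ASSUMPTION: support = (CountB - CountA) / N, where N is the number of
    announcements that contain the verb. Positive values mean the word
    mostly precedes the verb; values near zero mean it is uninformative.
    """
    count_b, count_a = Counter(), Counter()
    n_docs = 0
    for tokens in docs:
        if verb not in tokens:
            continue
        n_docs += 1
        pos = tokens.index(verb)  # first occurrence of the verb
        # Count each word at most once per announcement and per side.
        count_b.update(set(tokens[:pos]) - {verb})
        count_a.update(set(tokens[pos + 1:]) - {verb})
    if n_docs == 0:
        return {}
    words = set(count_b) | set(count_a)
    return {w: (count_b[w] - count_a[w]) / n_docs for w in words}


def collocations(docs, verb, threshold=0.5):
    """Split words whose |support| meets the threshold into before/after lists."""
    scores = support_scores(docs, verb)
    before = sorted((w for w, s in scores.items() if s >= threshold),
                    key=lambda w: (-scores[w], w))
    after = sorted((w for w, s in scores.items() if -s >= threshold),
                   key=lambda w: (scores[w], w))
    return before, after


# Toy, hand-tokenized announcements (hypothetical data).
docs = [
    ["生活", "垃圾", "焚烧", "发电", "项目"],
    ["垃圾", "焚烧", "发电", "项目", "中标"],
    ["废气", "焚烧", "装置"],
    ["公司", "公告"],
]
scores = support_scores(docs, "焚烧")
before, after = collocations(docs, "焚烧")
print(before, after)
```

With this toy input, only the words that sit consistently on one side of the verb survive the threshold, which is the behaviour the text attributes to conventional expressions.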
3.2. Three Classification Criteria
Based on the results of the above Algorithm 1, this paper puts forward three criteria to judge whether it constitutes the final event type from the perspective of data mining. The purpose of the three criteria is to select regular announcement news from the stock announcement news set containing verbs and construct it into a type. When constructing the event type of stock announcement news, the event extraction template is determined according to the criteria. The three criteria are as follows:
(1) For verbs without emotional polarity, if there is a collocation with a support of more than 0.95 around the verb, the collocation is combined with the verb to form the event trigger word. For example, there is a collocation “扩股/share expansion (0.99)” to the right of the verb “增资/capital increase”, so “增资扩股/capital increase and share expansion” is used as the event trigger word. If the event trigger word itself contains independent type information, it is determined as a type of event, and the event trigger word is directly used as the extraction template. If the event trigger word does not constitute independent type information, the collocation words whose support exceeds the threshold and the event trigger word are together constructed as a type of event, and the word combination is used as the event extraction template.
We take the “垃圾焚烧/garbage burn” event as an example to illustrate the advantages of the classification method. Firstly, through the algorithm, the output results about the verb “焚烧/burn” are as follows:
[中标/winning the bid (0.38)-环保/environmental protection (0.54)-生活/domestic (0.61)-垃圾/garbage (0.93)-焚烧/burn-发电/power generation (0.90)-项目/project (0.93)-投资/investment (0.45)-建设/construction (0.24)-处理/treatment (0.32)]
Since there is no combinatorial collocation around the word “焚烧/burn”, the word “burn” itself is used as an event trigger word, and the word itself does not constitute independent type information. Therefore, the type is constructed together with the collocation whose support exceeds the threshold, which forms the “garbage burn” event. The extraction template is:
[…]垃圾/garbage […]焚烧/burn […]发电/power generation […]项目/project […]
Through the event extraction template, the announcement news events related to environmental protection can be screened out from the meaningless stock announcement news containing “burn”. For example, announcement news 3 (shown below) can be excluded.
公告新闻3:辉丰股份(002496)公告,子公司华通化学收到环保局出具行政处罚决定书,要求对吡氟酰草胺项目责令限期改正,对RTO废气焚烧装置责令立即停止建设。
NEWS 3: (SZ002496) announced that its subsidiary Huatong Chemical received an administrative penalty decision from the Environmental Protection Bureau, ordering rectification of the diflufenican project within a time limit and an immediate halt to the construction of the RTO waste gas burn unit.
At the same time, the advantages of the event extraction template compared with the word similarity calculation method and clustering method can be seen in announcements 4 and 5.
公告新闻4:绿色动力(601330)11月28日晚间公告,公司成为葫芦岛东部垃圾焚烧发电综合处理厂生活垃圾焚烧发电项目的社会资本合作方,项目估算总投资不超过6.3亿元。
NEWS 4: (SH601330) announced on the evening of November 28 that the company has become a social capital partner of the domestic garbage burn power generation project of the waste incineration power generation comprehensive treatment plant in the east of Huludao, with an estimated total investment of no more than 630 million yuan.
公告新闻5:城发环境(000885)9月26日晚间公告,公司为宜阳县生活垃圾焚烧发电项目的中标人,项目总投资3.60亿元,项目合作期30年,含2年建设期。
NEWS 5: (SZ000885) announced on the evening of September 26 that the company is the bid winner of the Yiyang County domestic garbage burn power generation project, with a total investment of 360 million yuan and a cooperation period of 30 years, including a two-year construction period.
(2) If the verb or the collocation around the verb is combined to form an industry characteristic word, the verb or the combined words are used as the event trigger words, and the event trigger words are used as the event extraction template. Taking the “评价/evaluate” verb as an example, the output of the algorithm is:
仿制/Imitation (0.50)-药/medical (0.76)-药品/drug (0.41)-制药/Pharmaceutical (0.57)-通过/pass (0.67)-收到/receive (0.33)-一致性/consistency (0.78)-评价/Evaluate
The word “evaluate” itself does not constitute event type information, but it becomes a word with the characteristics of the pharmaceutical industry after being combined with the collocation word “consistency”. The introduction of “consistency evaluate” in Baidu Encyclopedia is as follows:
“Drug consistency evaluation” is a drug quality requirement in the 12th Five Year Plan for national drug safety, that is, the state requires that the imitated drugs should be consistent with the quality and efficacy of the original drugs. Specifically, it is required that the impurity spectrum is consistent, the stability is consistent, and the dissolution law in vivo and in vitro is consistent.
(3) The verbs with emotional polarity are screened: words with clear semantics are retained as event trigger words, and the event recognition template is constructed according to the trigger words. For example, emotional words such as “支持/support”, “通过/pass” and “指导/guide” are filtered out, and words such as “犯罪/crime” and “违纪/violation of discipline” are retained.
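To illustrate how a word-combination template from criterion (1) might be applied, the sketch below builds the “garbage burn” template from the collocation list shown for “焚烧/burn” and tests it against announcements 3, 4 and 5. It is a sketch under assumptions: the template is rendered as a regular expression in which each […] gap becomes a non-greedy wildcard, and the 0.8 support cutoff is a hypothetical value chosen only because it reproduces the template given in the text (the paper does not state the threshold). The function name `build_template` is likewise hypothetical.

```python
import re

# Collocations and supports for the verb 焚烧, copied from the example output.
BEFORE = [("中标", 0.38), ("环保", 0.54), ("生活", 0.61), ("垃圾", 0.93)]
VERB = "焚烧"
AFTER = [("发电", 0.90), ("项目", 0.93), ("投资", 0.45), ("建设", 0.24), ("处理", 0.32)]


def build_template(before, verb, after, threshold):
    """Keep collocations at or above the threshold and join them in order.

    Each [...] gap of the extraction template is modelled as ".*?".
    """
    words = [w for w, s in before if s >= threshold]
    words.append(verb)
    words += [w for w, s in after if s >= threshold]
    return re.compile(".*?".join(re.escape(w) for w in words))


# ASSUMPTION: 0.8 is an illustrative cutoff, not the paper's threshold.
TEMPLATE = build_template(BEFORE, VERB, AFTER, threshold=0.8)

NEWS_3 = ("辉丰股份(002496)公告,子公司华通化学收到环保局出具行政处罚决定书,"
          "要求对吡氟酰草胺项目责令限期改正,对RTO废气焚烧装置责令立即停止建设。")
NEWS_4 = ("绿色动力(601330)11月28日晚间公告,公司成为葫芦岛东部垃圾焚烧发电综合处理厂"
          "生活垃圾焚烧发电项目的社会资本合作方,项目估算总投资不超过6.3亿元。")
NEWS_5 = ("城发环境(000885)9月26日晚间公告,公司为宜阳县生活垃圾焚烧发电项目的中标人,"
          "项目总投资3.60亿元,项目合作期30年,含2年建设期。")

for name, text in [("NEWS 3", NEWS_3), ("NEWS 4", NEWS_4), ("NEWS 5", NEWS_5)]:
    print(name, "matches" if TEMPLATE.search(text) else "excluded")
```

Announcement 3 contains “焚烧” but not the preceding “垃圾”, so the ordered template rejects it, while announcements 4 and 5 satisfy every slot in order.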
3.3. Event Types of Chinese Stock Announcement News
We used data on Chinese stock announcements collected from the EASTMONEY website (
The processing flow of our model is shown in Figure 1.
4. Experimental Verification
4.1. Data Description
The main problem faced by experiments on event extraction methods in a specific domain is a lack of a unified corpus and type division standards. Existing studies generally label the experimental data manually and then verify the event extraction method on the labeled data set. The purpose of this section is to verify whether the classification of event types in the stock announcement news proposed by this paper is reasonable. Therefore, we first build the evaluation dataset and randomly select 60 stock announcements for each type of event, of which 30 meet the event identification template (the actual number shall prevail if less than 30). If the event identification template is in the form of a word combination, then the remaining announcement news is extracted from the announcement news that contains event trigger words but does not meet the identification template. For example, the announcement news that contains the word “焚烧/burn” is selected from the garbage burn event. If the recognition template is in the form of non-compound words, it will be randomly selected from other announcements. Finally, each evaluation sample contains 54 × 60 = 3240 announcements.
We select five teachers from Southwest University of Political Science and Law who hold doctoral degrees or a vice senior academic title and have more than three years of practical experience in the stock market as the evaluators. A random evaluation sample is generated for each evaluator. In the evaluation sample, an example announcement is provided for each type of event. The evaluator marks the announcement news similar to the example as 1 and those not similar as 0.
4.2. Evaluation Results
In this paper, the precision p, recall R and F values are used to calculate the results of five evaluation samples, and then the average value is taken as the final experimental result. The formal definitions of p, R, and F are as follows:
(2)
(3)
(4)
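A minimal sketch of these measures is given below, assuming the standard definitions of precision, recall and the balanced F measure (the harmonic mean of the two). This assumption is at least consistent with the first row of Table 2, where p = 0.977 and R = 0.95 give F = 0.963. Treating the template decision as the prediction and the evaluator's mark as the reference is also an assumption, and all function names are hypothetical.

```python
def f_measure(p, r):
    """Balanced F measure: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def precision_recall_f(template_hits, evaluator_marks):
    """Compute (p, R, F) for one event type in one evaluation sample.

    ASSUMPTION: template_hits[i] is 1 if announcement i meets the event
    identification template, and evaluator_marks[i] is 1 if the evaluator
    marked it as similar to the example announcement.
    """
    pairs = list(zip(template_hits, evaluator_marks))
    tp = sum(1 for t, e in pairs if t == 1 and e == 1)
    fp = sum(1 for t, e in pairs if t == 1 and e == 0)
    fn = sum(1 for t, e in pairs if t == 0 and e == 1)
    p = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    r = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return p, r, f_measure(p, r)


def average_over_evaluators(results):
    """Average (p, R, F) triples from several evaluation samples."""
    n = len(results)
    return tuple(sum(res[i] for res in results) / n for i in range(3))


# Consistency check against the first row of Table 2.
f_row1 = round(f_measure(0.977, 0.95), 3)
print(f_row1)
```

Averaging the per-evaluator triples with `average_over_evaluators` mirrors the statement that the mean over the five evaluation samples is taken as the final result.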
The final experimental results are shown in Table 2.
From the overall results, all event types identified have an average p value of 0.927, R value of 0.969 and F value of 0.946, which shows that the type of announcements constructed in this paper is reasonable. From the individual point of view, the p value of some event types is poor, far lower than the average value. Through discussions with the evaluators, we found that the reasons are as follows:
The average p value of “signing” is 0.796, which is far lower than the average since one of the legal professional evaluators believes that there is a semantic difference between the word “签署” and the word “签订”, although the two meanings are very similar.
The average p value of the “profit” event is 0.743, which is much lower than the average since the specific information about the “profit” event is described in detail in the sample text. Announcements that do not describe the specific information of profit in detail shifted the focus of the evaluators.
The average p value of “impairment” is 0.752, which is much lower than the average value. The reason is that the sample text describes the event of “asset impairment”. The evaluators exclude the remaining “goodwill impairment” and “accrued (excluding asset) impairment” from the type, so the p value is low.
The average p value of “planning” is 0.759, which is much lower than the average since the example text contains the word “major event” in addition to the planning trigger word. The evaluators believe that “major events” play an important role in representing the planned events, so they marked the evaluation text without “major events” as different.
4.3. Comparison to Existing Results
At present, there have been few studies focusing on event extraction from stock announcements [12,19,21], and more studies are focusing on event extraction from financial news. We select some representative related studies and list them in Table 3 for comparison. We roughly divide the methods of generating the event type framework into two categories: full-manual and semi-manual. Full-manual means that the event type framework is completely determined by domain experts; semi-manual means a combination of some model or algorithm and manual identification.
Due to a lack of a unified event type standard and framework for stock announcements, we cannot compare our work with the existing related studies with numerical indicators. Based on the fact that we have built a fine-grained event type framework, we mainly compare our work with [18,19,21].
The model proposed in [18] identifies 11 types and 41 sub-types of events from the business domain. The p value is 0.95, and the F value is 0.79. References [19,21] both use full-manual methods to determine the event types, and both focus on the stock announcement news. The event type framework in [19] includes 12 types and 30 sub-types of events. Reference [21] includes 4 types and 34 sub-types of events. The p value in [19] is 0.673, and the F1 value is 0.60. The p value in [21] is 0.967, and the F value is missing in the paper. The p value of our work is 0.927, which is lower than those of [18,21] but higher than that of [19]. The F value of our work is 0.946, higher than [18,19].
From the classification results, the fine-grained event type framework built in this paper finds some reasonable and valuable event types that have not yet been discussed. One example is the distinction between violations of company policy and violations of the law discussed earlier. Another interesting example is the event type we call throughput, which is usually issued by listed companies in the airline or port sector. To our knowledge, this event type has not been discussed in existing studies, which only list a related type called “performance change”. Technically, the throughput event is a sub-type of performance change. Performance change news is usually announced on a quarterly, semi-annual and annual basis, and thus cannot reflect short-term changes. For companies in these sectors, unlike those in other sectors, throughput directly concerns the company’s main business. For example, Air China’s (SH601111) passenger transport business achieved a revenue of 58.317 billion yuan in 2021, accounting for 78.24% of the operation revenue; CMB Shekou’s (SZ001872) port business accounted for 95.76% of its revenue in 2021.
Due to various limitations, the event type framework constructed in this paper cannot be directly compared with those of other studies. However, through the analysis of the results we did find some event types that have not been discussed in the existing literature, and these types are effective and reasonable. Therefore, we can say that the event type framework constructed by the method proposed in this paper enriches the existing research, and thus it has certain value and significance.
5. Filtering of Event Types
In this section, we did not choose to conduct stock return prediction in the traditional way since we think it is inappropriate to rely solely on stock announcement news. Instead, based on the unilateral trading policy of the Chinese stock market, we filtered out some event types that are not valuable to investors. Our approach is that after the announcement news is released, we enter the market and sell at the highest price in the short term. Although in real life, the possibility of selling at the highest price is very low, here we are describing the best-case scenario. We believe that if in the ideal situation the return obtained based on a certain event type is small or the probability is low, combined with the unilateral trading policy, we think such an event type is not valuable to investors.
5.1. Return Calculation Method
If the official announcement time of the event is between the opening in the morning of the day and the closing in the afternoon of the day, mark the day as t = 1. Otherwise, mark the first trading day after the event as t = 1. We consider the best return and probability of entering the market at t = 1 and selling stocks at t = 2 or t = 3. This paper selects three kinds of entry prices: opening price, closing price and the highest price in the worst case, and sells the stock at the highest price of the day at t = 2 or t = 3 to obtain the return. Although the probability of selling at the highest price is small in reality, the purpose of this paper is to provide a reference for investors according to historical data based on the probability of obtaining the best return value. If the return value or proportion is low under the best-case scenario, this indicates that making decisions based on this kind of news is risky and has no investment value. The calculation of the best return obtained at three entry prices is as follows:
(5)
(6)
(7)
DRETOH2 refers to the return from entering the market at the opening price at t = 1 and selling stocks at the highest price at t = 2; similarly, DRETOH3 is the return from entering the market at the opening price at t = 1 and selling stocks at the highest price at t = 3. DRETHH2 and DRETCH2 represent the best return when entering the market at the highest price and at the closing price of the day at t = 1, respectively, and selling stocks at the highest price at t = 2.
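The three best-case returns can be sketched as follows, assuming simple percentage returns relative to the entry price. Whether the paper uses simple or logarithmic returns is not stated here, so this form is an assumption, as are the `DayBar` type, the function name and the example prices.

```python
from dataclasses import dataclass


@dataclass
class DayBar:
    """Daily prices of one stock on one trading day (hypothetical container)."""
    open: float
    high: float
    close: float


def best_case_returns(entry_day, exit_day):
    """Best-case returns for entering at t = 1 and selling at the exit day's high.

    ASSUMPTION: simple return = (sell price - entry price) / entry price.
    Keys follow the naming in the text: O = open, H = high, C = close entry.
    """
    sell = exit_day.high
    return {
        "DRETOH": (sell - entry_day.open) / entry_day.open,
        "DRETHH": (sell - entry_day.high) / entry_day.high,
        "DRETCH": (sell - entry_day.close) / entry_day.close,
    }


# Hypothetical prices: t = 1 is the entry day, t = 2 the selling day.
day1 = DayBar(open=10.0, high=10.5, close=10.2)
day2 = DayBar(open=10.2, high=11.0, close=10.8)
returns_t2 = best_case_returns(day1, day2)
print(returns_t2)
```

Passing the bar for t = 3 as `exit_day` gives the corresponding t = 3 variants such as DRETOH3. Entering at the day's high is the least favourable entry, so DRETHH is never larger than the other two for the same exit day.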
5.2. Investment Results
According to the 54 types of announcement events constructed in this paper, the transaction data within the time span from March 2015 to December 2019 are selected. In this period, by using the above return calculation method, the results of some types of investment return are shown in Table 4.
Due to space limitations, we only list the results of several event types in Table 4. Eight event types have low returns or probabilities, even under the best conditions. Among these eight types of events, the event types with the smallest benefit value are “Illegal” and “Restructuring” which only come with an average positive return of 0.8%. The remaining events in the order of return from small to large are: “Expiration” (0.9%), “order to correct” (1.0%), “Resignation” (1.1%), “Recall” (1.1%), “Freeze” (1.3%) and “Planning” (2.7%).
The event type with the smallest probability value is “Freeze” (68.8%). The remaining events in the order of probability from small to large are: “Planning” (69%), “Illegal” (70.5%), “Recall” (75%), “Resignation” (75.9%), “Expiration” (78%), “order to correct” (78.9%), and “Restructuring” (79.7%).
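The screening described above can be illustrated with the short sketch below. It rests on two interpretive assumptions that the text does not confirm: that the “probability” of an event type is the share of its events with a positive best-case return, and that the “average positive return” is the mean over those positive cases. The cutoff values passed to `is_low_value` are hypothetical and are not taken from the paper.

```python
def summarize(returns):
    """Return (average positive return, probability of a positive return).

    ASSUMPTION: probability = share of events whose best-case return is > 0;
    average positive return = mean of the positive best-case returns.
    """
    positive = [r for r in returns if r > 0]
    probability = len(positive) / len(returns) if returns else 0.0
    average = sum(positive) / len(positive) if positive else 0.0
    return average, probability


def is_low_value(returns, min_return, min_probability):
    """Flag an event type whose best-case return or probability is low."""
    average, probability = summarize(returns)
    return average < min_return or probability < min_probability


# Hypothetical best-case returns for the events of one event type.
sample = [0.02, -0.01, 0.04, 0.0]
average, probability = summarize(sample)
print(average, probability)
```

Because the returns are already best-case values, an event type flagged by such a rule would look unattractive even before realistic selling prices are considered.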
6. Conclusions
Stock announcements contain much information about all aspects of a company, which is important for investors and stock forecasting. It is difficult to determine the event types from stock announcements. As there is no unified classification standard, existing studies have constructed various event type frameworks based on domain experts’ experience, clustering, ontology and other methods. Some studies have resulted in a fine-grained classification framework. However, we believe that there is still room for improvement on the basis of the existing research (e.g., the abovementioned violations of laws and violations of company policy events). Because these events carry different punishments and create different expectations among investors, we think that it is more reasonable to classify them into different types rather than into one type, as is done in the extant literature.
In order to obtain more detailed event types of stock announcement news, we proposed a two-step method. First, all verbs extracted from the announcement news are used as candidate event triggers. Due to the common expressions in Chinese announcement news, if there is an event type, it usually has a conventional expression form. On the contrary, if a candidate event trigger word (verb) does not suggest an event type, the expression of the news containing the verb is chaotic. Therefore, we combine co-occurrence words with the candidate event trigger words and express them in an ordered sequence of words. Then, we use three proposed criteria to determine the final event types.
Based on real data on the Chinese stock market, we finally constructed 54 event types from the announcement news. The verification results of the constructed event types (p = 0.927, F = 0.946) show that the framework is reasonable and consistent with people’s cognition. Further, we compared our work with other similar studies (summarized in Table 3). First of all, most of the existing studies focus on the event types in financial news and regard stock announcement news only as part of this greater whole. Therefore, the event type frameworks built are usually rough. Then, we compared our results with those of [18,19,21], which also constructed fine-grained event types. The p value of our work is lower than those of [18] (0.95) and [21] (0.967) but higher than that of [19] (0.673). The F value of our work is higher than those of [18] (0.79) and [19] (0.60). From the results of the constructed event types, our method has found some reasonable and valuable event types that have not yet been discussed. For example, an event type named “throughput” is constructed in this paper. To the best of our knowledge, this is the first of its kind, and only one similar event type called “performance change” can be found in the existing research. In the Chinese stock market, companies usually release performance change news on a quarterly, semi-annual or annual basis, so such news cannot reflect short-term changes. “Throughput” events are released by companies in the airline or port sector. For these companies, unlike those in other sectors, throughput directly concerns the main business. For example, CMB Shekou’s (SZ001872) port business accounted for 95.76% of its revenue in 2021. “Throughput” events can reflect the short-term performance changes of these companies and are valuable for investors.
In conclusion, our research on event extraction from stock announcements has enriched the existing literature, so it is of value and significance.
After constructing the fine-grained event type framework for announcement news, we did not conduct stock prediction in the traditional way, because we believe that it is inappropriate to consider announcements without other types of news, such as industry news and macroeconomic news (as has been shown in the literature). Instead, based on the unilateral trading policy of the Chinese stock market, we screened out some event types that are not valuable to investors according to their performance under the best-case scenario. We did not carry out a precise calculation here; rather, we considered event performance under these special circumstances.
Formal analysis, F.M.; investigation, P.W.; methodology, F.M., Y.X. and W.L.; software, F.M. and H.J.; supervision, F.M.; writing—original draft, F.M., P.W., Y.X., H.J. and W.L.; writing—review and editing, F.M., Y.X., P.W., H.J. and W.L. All authors have read and agreed to the published version of the manuscript.
The data presented in this study are available on request from the corresponding author.
The authors declare no conflict of interest.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
“投产/Put into production” event.
Event Type: | “投产/Put into Production” |
---|---|
Event trigger word: | 投产/Put into production |
Words matching list extracted by the algorithm: | 期/Phase (0.33)-项目/Project (0.75)-建成/Completed (0.23)-投产/Put into production |
Event identification template: | […]投产/Put into production […] |
Event example: | (SZ000952): The VB2 production line of the industrial park will be officially put into production, and the performance is expected to achieve restorative growth. |
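As a rough illustration of how an event identification template of the form "[…]trigger[…]" could be applied, the sketch below compiles a template into a regular expression in which each "[…]" matches arbitrary text. The helper names `compile_template` and `matches_template` are hypothetical, the matching is applied to the English gloss of the example rather than to the Chinese original, and the paper's actual matching procedure may differ.

```python
import re


def compile_template(template):
    """Compile a '[…]'-style template; each '[…]' matches any text."""
    parts = [re.escape(p.strip()) for p in template.split("[…]")]
    return re.compile(".*?".join(parts), re.IGNORECASE | re.DOTALL)


def matches_template(template, text):
    """Return True if `text` contains the template's fixed parts in order."""
    return compile_template(template).search(text) is not None


template = "[…]put into production[…]"
news = ("The VB2 production line of the industrial park will be officially "
        "put into production, and the performance is expected to achieve "
        "restorative growth.")
print(matches_template(template, news))
```

Templates with several fixed parts, such as a co-occurrence word followed by the trigger, are handled in the same way because the fixed parts must appear in the stated order.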
Experimental results of event types.
ID | Event Type | p | R | F |
---|---|---|---|---|
1 | “垃圾焚烧”事件/garbage burn | 0.977 | 0.95 | 0.963 |
2 | “增资扩股”事件/Capital increase and share expansion | 0.903 | 0.920 | 0.912 |
3 | “业绩预告”事件/Performance forecast | 0.910 | 0.947 | 0.928 |
4 | “责令改正”事件/Order to correct | 0.936 | 1.000 | 0.967 |
5 | “权益分派”事件/Equity distribution | 1.000 | 0.952 | 0.975 |
6 | “股票解禁”事件/lifting the ban on stocks | 0.901 | 1.000 | 0.948 |
7 | “到期失效”事件/Expiration | 1.000 | 0.988 | 0.994 |
8 | “不确定性”事件/Uncertain | 0.957 | 0.989 | 0.972 |
9 | “届满”事件/Expiration | 0.879 | 0.944 | 0.911 |
10 | “可转换债券”事件/Convertible bond | 0.925 | 0.966 | 0.945 |
11 | “补助”事件/Subsidy | 0.935 | 0.974 | 0.955 |
12 | “犯罪”事件/Crime | 0.917 | 0.927 | 0.922 |
13 | “辞职”事件/Resignation | 0.962 | 0.989 | 0.975 |
14 | “一致性评价”事件/Consistency evaluation | 0.871 | 1.000 | 0.931 |
15 | “侦查”事件/Investigation incident | 1.000 | 0.933 | 0.966 |
16 | “违纪”事件/Violation of discipline | 0.897 | 0.977 | 0.935 |
17 | “行政处罚”事件/Administrative punishment | 0.946 | 0.891 | 0.918 |
18 | “拨付款”事件/Payment allocation | 0.879 | 1.000 | 0.935 |
19 | “投产”事件/Put into production | 0.871 | 0.989 | 0.926 |
20 | “拘留”事件/Detention | 1.000 | 1.000 | 1.000 |
21 | “盈利”事件/Profit | 0.743 | 0.909 | 0.818 |
22 | “预增”事件/Pre increase | 0.978 | 0.940 | 0.959 |
23 | “改制”事件/Restructuring | 1.000 | 1.000 | 1.000 |
24 | “减值”事件/Devaluation | 0.752 | 0.989 | 0.854 |
25 | “减持”事件/Reduction | 0.968 | 0.968 | 0.968 |
26 | “建成”事件/Completion | 0.853 | 1.000 | 0.921 |
27 | “清仓”事件/Clearance | 0.849 | 0.939 | 0.892 |
28 | “吞吐量”事件/Throughput | 1.000 | 1.000 | 1.000 |
29 | “预中标”事件/Pre bid winning | 1.000 | 1.000 | 1.000 |
30 | “转增股”事件/Conversion to share capital | 0.827 | 0.990 | 0.901 |
31 | “中标”事件/Winning the bid | 1.000 | 1.000 | 1.000 |
32 | “吸收合并”事件/Absorb merge | 0.957 | 0.937 | 0.947 |
33 | “扩建”事件/Expansion | 0.882 | 0.978 | 0.927 |
34 | “诉讼”事件/Litigation | 0.957 | 1.000 | 0.978 |
35 | “发起设立”事件/Initiate establishment | 0.875 | 0.893 | 0.884 |
36 | “投建”事件/Investment and construction | 0.978 | 0.989 | 0.984 |
37 | “罢免”事件/Recall | 0.967 | 1.000 | 0.983 |
38 | “药品临床”事件/Drug clinical | 0.817 | 1.000 | 0.899 |
39 | “筹划”事件/Planning | 0.759 | 0.908 | 0.827 |
40 | “并购”事件/Merger and acquisition | 0.925 | 0.976 | 0.950 |
41 | “转让”事件/Transfer | 0.829 | 0.823 | 0.826 |
42 | “净利”事件/Net profit | 1.000 | 0.979 | 0.989 |
43 | “补贴”事件/Subsidy | 0.913 | 1.000 | 0.955 |
44 | “收购”事件/Acquisition | 0.968 | 0.958 | 0.963 |
45 | “增持”事件/Overweight | 0.989 | 0.924 | 0.956 |
46 | “质押”事件/Pledge | 0.989 | 0.969 | 0.979 |
47 | “罚款”事件/Fine | 0.975 | 1.000 | 0.988 |
48 | “违法”事件/Illegal | 0.914 | 1.000 | 0.955 |
49 | “冻结”事件/Freeze | 1.000 | 1.000 | 1.000 |
50 | “签署签订”事件/Signing | 0.796 | 1.000 | 0.886 |
51 | “回购”事件/Repurchase | 0.978 | 0.989 | 0.984 |
52 | “出售”事件/Sale | 1.000 | 0.990 | 0.995 |
53 | “设立公司”事件/Establishment of company | 0.925 | 0.943 | 0.934 |
54 | “股票激励”事件/Stock incentive | 0.968 | 0.949 | 0.959 |
Total | | 0.927 | 0.969 | 0.946 |
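The three columns can be read as precision (p), recall (R) and F-measure (F). The sketch below assumes that F is the harmonic mean of precision and recall, which agrees with rows 1 and 4 of the table after rounding; the function names and the confusion counts in the usage line are illustrative only, and how the Total row was aggregated is not stated in the source.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of F)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def prf(tp, fp, fn):
    """Precision, recall and F from true/false positives and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, f_measure(precision, recall)


# Row 1 ("garbage burn"): p = 0.977, R = 0.95
print(round(f_measure(0.977, 0.95), 3))
# Illustrative counts, not taken from the paper.
print(prf(tp=9, fp=1, fn=0))
```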
The results of related studies.
Source | Method | Event Type Framework |
---|---|---|
ACE event | full-manual | 1 type: Business |
The Stock Sonar project [ | semi-manual | 8 types: Legal, Analyst Recommendation, Financial, Stock Price Change, Deals, Mergers and Acquisitions, Partnerships, Product and Employment |
BEECON [ | semi-manual | 11 types: Analyst Event, Bankruptcy, Company Basic Information Change, Company Collaboration, Company Growth, Product Event, etc. |
He [ | semi-manual | 3 types: Financial Policy Events, Monetary Policy Events and Market Rule Adjustment |
Wang [ | semi-manual | 2 types: Macro-Events, Individual Stock Event |
Chen [ | full-manual | 8 types: Major contracts, Raw Materials, Major Conferences, Company Financial Statements, Major Policies, Mergers and Acquisitions, Personnel Changes and Additional Allotments |
Han et al. [ | full-manual | 8 types: Product Transformation, Equity Change, Share Price Movement, Personnel Changes, Financial Status, etc. |
Zhang [ | full-manual | 12 types: Major Events, Major Risks, Shareholding Changes, Capital Changes, Emergencies, Special Treatment, etc. |
Boudoukh et al. [ | full-manual | 18 types: Business Trend, Deal, Employment, Financial, Mergers and Acquisitions, Earnings Factors, Ratings, Legal, Product, Investment, etc. |
Wu [ | semi-manual | 13 types: Issuance, Dividend, Event Prompt, Pledge, Performance Notice, Suspension and Resumption of Trading, Fund-Raising, etc. |
Zhou [ | full-manual | 4 types: Share Change, Debt, Market Transaction, Enterprise Change |
Investment return for some event types. "Opening", "Highest" and "Closing" denote the purchase price.
Event Type | Selling Time t | Opening: Probability of Positive Return | Opening: Average Return | Opening: Variance | Highest: Probability of Positive Return | Highest: Average Return | Highest: Variance | Closing: Probability of Positive Return | Closing: Average Return | Closing: Variance | Sample |
---|---|---|---|---|---|---|---|---|---|---|---|
Capital increase and share expansion | 2 | 69.4% | 3.3% | 0.4% | 42.2% | 0.3% | 0.3% | 78.2% | 2.6% | 0.2% | 147 |
Capital increase and share expansion | 3 | 61.2% | 3.2% | 1.1% | 42.2% | 0.2% | 0.9% | 63.3% | 2.5% | 0.7% | 147 |
Expiration | 2 | 77.1% | 3.1% | 0.1% | 62.9% | 0.7% | 0.1% | 91.4% | 2.5% | 0.1% | 35 |
Expiration | 3 | 91.4% | 3.4% | 0.1% | 77.1% | 1.0% | 0.1% | 91.4% | 2.8% | 0.1% | 35 |
Restructuring | 2 | 71.6% | 1.0% | 0.3% | 33.8% | −1.4% | 0.2% | 79.7% | 0.8% | 0.1% | 74 |
Restructuring | 3 | 58.1% | −0.1% | 0.6% | 35.1% | −2.5% | 0.4% | 51.4% | −0.4% | 0.3% | 74 |
Throughput | 2 | 68.8% | 1.5% | 0.1% | 42.5% | 0.1% | 0.1% | 83.8% | 1.4% | 0.0% | 80 |
Throughput | 3 | 60.0% | 1.5% | 0.2% | 42.5% | 0.2% | 0.2% | 70.0% | 1.4% | 0.1% | 80 |
Conversion to share capital | 2 | 64.4% | 2.3% | 0.5% | 42.5% | 0.1% | 0.4% | 78.3% | 2.9% | 0.3% | 811 |
Conversion to share capital | 3 | 59.4% | 2.5% | 1.0% | 45.4% | 0.3% | 1.0% | 67.2% | 3.0% | 0.8% | 811 |
Winning the bid | 2 | 66.7% | 1.5% | 0.2% | 36.3% | −0.5% | 0.1% | 79.6% | 1.5% | 0.1% | 1990 |
Winning the bid | 3 | 59.6% | 1.3% | 0.4% | 37.3% | −0.7% | 0.3% | 64.2% | 1.3% | 0.3% | 1990 |
Subsidy | 2 | 73.8% | 2.3% | 0.2% | 42.1% | 0.2% | 0.1% | 82.2% | 1.9% | 0.1% | 107 |
Subsidy | 3 | 67.3% | 2.3% | 0.3% | 40.2% | 0.2% | 0.2% | 70.1% | 2.0% | 0.2% | 107 |
Acquisition | 2 | 64.5% | 2.3% | 0.5% | 42.0% | 0.0% | 0.4% | 73.8% | 2.3% | 0.3% | 3555 |
Acquisition | 3 | 58.6% | 2.3% | 1.1% | 42.0% | 0.0% | 1.0% | 60.0% | 2.3% | 0.9% | 3555 |
Overweight | 2 | 72.8% | 3.2% | 0.4% | 43.1% | 0.2% | 0.2% | 81.7% | 2.6% | 0.2% | 3268 |
Overweight | 3 | 66.7% | 3.3% | 0.7% | 43.5% | 0.3% | 0.5% | 67.1% | 2.7% | 0.5% | 3268 |
Illegal | 2 | 65.2% | 1.5% | 0.4% | 35.0% | −1.2% | 0.2% | 70.5% | 0.8% | 0.2% | 397 |
Illegal | 3 | 60.5% | 1.5% | 0.8% | 39.3% | −1.2% | 0.5% | 63.2% | 0.8% | 0.5% | 397 |
Signing | 2 | 65.7% | 2.1% | 0.3% | 39.2% | −0.2% | 0.2% | 77.4% | 2.1% | 0.1% | 1809 |
Signing | 3 | 58.5% | 1.9% | 0.6% | 40.3% | −0.5% | 0.6% | 64.0% | 1.8% | 0.5% | 1809 |
Stock incentive | 2 | 72.1% | 2.9% | 0.3% | 40.4% | 0.0% | 0.2% | 81.9% | 2.3% | 0.1% | 408 |
Stock incentive | 3 | 66.4% | 2.8% | 0.7% | 42.9% | −0.2% | 0.5% | 67.4% | 2.1% | 0.5% | 408 |
Remarks on the experimental results, by event type:
Capital increase and share expansion: The best investment scheme is to buy at the closing price and sell on the second day, with a 78.2% probability of a positive return and an average return of 2.6%.
Expiration: The best investment scheme is to buy at the opening price and sell on the third day, with a 91.4% probability of a positive return and an average return of 3.4%. The second-best scheme is to buy at the closing price and sell either on the third day (91.4% probability, average return 2.8%) or on the second day (91.4% probability, average return 2.5%).
Restructuring: The average return for this kind of event is low. Therefore, such events have no investment value.
Throughput: The best investment scheme is to buy at the closing price and sell on the second day, with an 83.8% probability of a positive return and an average return of 1.4%.
Conversion to share capital: The best investment scheme is to buy at the closing price and sell on the second day, with a 78.3% probability of a positive return and an average return of 2.9%.
Winning the bid: The best investment scheme is to buy at the closing price and sell on the second day, with a 79.6% probability of a positive return and an average return of 1.5%.
Subsidy: The best investment scheme is to buy at the closing price and sell on the second day, with an 82.2% probability of a positive return and an average return of 1.9%.
Acquisition: The best investment scheme is to buy at the closing price and sell on the second day, with a 73.8% probability of a positive return and an average return of 2.3%.
Overweight: The best investment scheme is to buy at the closing price and sell on the second day, with an 81.7% probability of a positive return and an average return of 2.6%.
Illegal: Although the probability of obtaining a positive return is 70.5%, the average return is small. Therefore, on the whole, such events are not good for investment.
Signing: The best investment scheme is to buy at the closing price and sell on the second day, with a 77.4% probability of a positive return and an average return of 2.1%.
Stock incentive: The best investment scheme is to buy at the closing price and sell on the second day (81.9% probability of a positive return, average return 2.3%) or to buy at the opening price and sell on the second day (72.1% probability, average return 2.9%).
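The three statistics reported for each purchase price in the table above can be computed along the following lines. This is a sketch under stated assumptions rather than the calculation used in the paper: returns are taken as simple returns (sell price minus purchase price, divided by purchase price), the variance is the population variance, transaction costs are ignored, and the function name `return_stats` and the prices in the usage example are hypothetical. The source does not specify which price on day t is used as the selling price.

```python
from statistics import mean, pvariance


def return_stats(buy_prices, sell_prices):
    """Summarize simple returns over a sample of events.

    Assumptions: one (buy, sell) price pair per event, simple returns,
    population variance, no transaction costs.
    """
    if not buy_prices or len(buy_prices) != len(sell_prices):
        raise ValueError("need equally sized, non-empty price lists")
    returns = [(sell - buy) / buy for buy, sell in zip(buy_prices, sell_prices)]
    return {
        "p_positive": sum(r > 0 for r in returns) / len(returns),
        "avg_return": mean(returns),
        "variance": pvariance(returns),
        "sample": len(returns),
    }


# Hypothetical prices for four events of one type.
stats = return_stats([10.0, 10.0, 10.0, 10.0], [11.0, 10.5, 9.5, 10.2])
print(stats)
```

Running the function once per purchase price (opening, highest, closing) and per selling time t would produce one row of the table for a given event type.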
References
1. Yi, Z. Research on Deep Learning Based Event-Driven Stock Prediction. Ph.D. Thesis; Harbin Institute of Technology: Harbin, China, 2019.
2. Yang, H.; Chen, Y.; Liu, K.; Xiao, Y.; Zhao, J. DCFEE: A document-level Chinese financial event extraction system based on automatically labeled training data. Proceedings of the ACL 2018, System Demonstrations; Melbourne, Australia, 15–20 July 2018; pp. 50-55.
3. Wu, L. Empirical Analysis of the Impact of Stock Events on Abnormal Volatility of Stock Prices. Master’s Thesis; Huazhong University of Science and Technology: Wuhan, China, 2019.
4. Balali, A.; Asadpour, M.; Jafari, S.H. COfEE: A Comprehensive Ontology for Event Extraction from text. arXiv; 2021; arXiv:2107.10326 [DOI: https://dx.doi.org/10.2139/ssrn.4117538]
5. Guda, V.; Sanampudi, S.K. Rules based event extraction from natural language text. Proceedings of the 2016 IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT); Bangalore, India, 20–21 May 2016; pp. 9-13.
6. Ritter, A.; Etzioni, O.; Clark, S. Open domain event extraction from twitter. Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; Beijing, China, 12–16 August 2012; pp. 1104-1112.
7. Fung, G.P.C.; Yu, J.X.; Lam, W. Stock prediction: Integrating text mining approach using real-time news. Proceedings of the 2003 IEEE International Conference on Computational Intelligence for Financial Engineering; Hong Kong, China, 20–23 March 2003; pp. 395-402.
8. Wong, K.F.; Xia, Y.; Xu, R.; Wu, M.; Li, W. Pattern-based opinion mining for stock market trend prediction. Int. J. Comput. Processing Lang.; 2008; 21, pp. 347-361. [DOI: https://dx.doi.org/10.1142/S1793840608001949]
9. Du, M.; Pivovarova, L.; Yangarber, R. PULS: Natural language processing for business intelligence. Proceedings of the 2016 Workshop on Human Language Technology; New York, NY, USA, 9–15 July 2016; pp. 1-8.
10. Chen, C.; Ng, V. Joint modeling for Chinese event extraction with rich linguistic features. Proceedings of the COLING 2012; Mumbai, India, 8–15 December 2012; pp. 529-544.
11. Liu, L. Heterogeneous Information Based Financial Event Detection. Ph.D. Thesis; Harbin Institute of Technology: Harbin, China, 2010.
12. Feldman, R.; Rosenfeld, B.; Bar-Haim, R.; Fresko, M. The stock sonar—sentiment analysis of stocks based on a hybrid approach. Proceedings of the AAAI Conference on Artificial Intelligence; San Francisco, CA, USA, 7–11 August 2011; pp. 1642-1647.
13. He, Y. Research on Ontology-Based Case Base Building and Reasoning for Theme Events in the Stock Markets. Master’s Thesis; Hefei University of Technology: Hefei, China, 2017.
14. Wang, Y. Research on Financial Events Detection by Incorporating Text and Time-Series Data. Master’s Thesis; Harbin Institute of Technology: Harbin, China, 2015.
15. Chen, H. Research and Application of Event Extraction Technology in Financial Field. Ph.D. Thesis; Beijing Institute of Technology: Beijing, China, 2017.
16. Han, S.; Hao, X.; Huang, H. An event-extraction approach for business analysis from online Chinese news. Electron. Commer. Res. Appl.; 2018; 28, pp. 244-260. [DOI: https://dx.doi.org/10.1016/j.elerap.2018.02.006]
17. Boudoukh, J.; Feldman, R.; Kogan, S.; Richardson, M. Information, trading, and volatility: Evidence from firm-specific news. Rev. Financ. Stud.; 2019; 32, pp. 992-1033. [DOI: https://dx.doi.org/10.1093/rfs/hhy083]
18. Arendarenko, E.; Kakkonen, T. Ontology-based information and event extraction for business intelligence. Proceedings of the 2012 International Conference on Artificial Intelligence: Methodology, Systems, and Applications; Varna, Bulgaria, 13–15 September 2012; pp. 89-102.
19. Zhang, W. Research on key technologies of event-driven stock market prediction. Ph.D. Thesis; Harbin Institute of Technology: Harbin, China, 2018.
20. Turchi, M.; Zavarella, V.; Tanev, H. Pattern learning for event extraction using monolingual statistical machine translation. Proceedings of the International Conference Recent Advances in Natural Language Processing 2011; Hissar, Bulgaria, 10–16 September 2011; pp. 371-377.
21. Zhou, X. Research on Financial Event Extraction Technology Based on Deep Learning. Ph.D. Thesis; University of Electronic Science and Technology of China: Chengdu, China, 2020.
22. Wang, Y.; Luo, S.; Hu, Z.; Han, M. A Study of event elements extraction on Chinese bond news texts. Proceedings of the 2018 IEEE International Conference on Progress in Informatics and Computing (PIC); Suzhou, China, 14–16 December 2018; pp. 420-424.
23. Ding, P.; Zhuoqian, L.; Yuan, D. Textual information extraction model of financial reports. Proceedings of the 2019 7th International Conference on Information Technology: IoT and Smart City; Shanghai, China, 20–23 December 2019; pp. 404-408.
24. Wang, A. Study on the Impact of Change on Interest Rate on Real Estate Listed Companies Stock Price. Master’s Thesis; Southwestern University of Finance and Economics: Chengdu, China, 2012.
25. Zhao, J.; Shen, Y.; Wu, F. Natural disasters, man-made disasters and stock prices: A study based on earthquakes and mass riots. J. Manag. Sci. China; 2014; 17, pp. 19-33.
26. Yi, Z.; Lu, H.; Pan, B. The impact of Sino US trade war on China’s stock market—An analysis based on event study method. J. Manag.; 2020; 33, pp. 18-28.
27. Li, T.; Qian, Z.; Deng, W.; Zhang, D.; Lu, H.; Wang, S. Forecasting crude oil prices based on variational mode decomposition and random sparse Bayesian learning. Appl. Soft Comput.; 2021; 113, 108032. [DOI: https://dx.doi.org/10.1016/j.asoc.2021.108032]
28. Brandt, M.; Gao, L. Macro fundamentals or geopolitical events? A textual analysis of news events for crude oil. J. Empir. Financ.; 2019; 51, pp. 64-94. [DOI: https://dx.doi.org/10.1016/j.jempfin.2019.01.007]
29. Li, T.; Shi, J.; Deng, W.; Hu, Z. Pyramid particle swarm optimization with novel strategies of competition and cooperation. Appl. Soft Comput.; 2022; 121, 108731. [DOI: https://dx.doi.org/10.1016/j.asoc.2022.108731]
30. Deng, W.; Shang, S.; Cai, X.; Zhao, H.; Zhou, Y.; Chen, H.; Deng, W. Quantum differential evolution with cooperative coevolution framework and hybrid mutation strategy for large scale optimization. Knowl.-Based Syst.; 2021; 224, 107080. [DOI: https://dx.doi.org/10.1016/j.knosys.2021.107080]
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abstract
Determining the event type is one of the main tasks of event extraction (EE). The announcement news released by listed companies contains a wide range of information, and it is a challenge to determine the event types. Some fine-grained event type frameworks have been built from financial news or stock announcement news manually by domain experts or by clustering, ontology or other methods. However, we think the existing results can still be improved. For example, a legal category has been created in previous studies, which treats violations of company rules and violations of the law as the same thing. However, the penalties they incur and the expectations they create for investors are different, so it is more reasonable to consider them different types. In order to classify the event types of stock announcement news more finely, this paper proposes a two-step method. First, the candidate event trigger words and the co-occurrence words satisfying the support value are extracted and arranged by the algorithm in the order of common expressions. Then, the final event types are determined using three proposed criteria. Based on real data from the Chinese stock market, this paper constructs 54 event types (p = 0.927, F = 0.946), some of which are reasonable and valuable types that have not been discussed in previous studies. Finally, based on the unilateral trading policy of the Chinese stock market, we screened out some event types that may not be valuable to investors.
1 School of Artificial Intelligence and Law, Southwest University of Political Science & Law, Chongqing 401120, China;
2 School of Economic Information Engineering, Southwestern University of Finance and Economics, Chengdu 611130, China;
3 School of Economics, Xihua University, Chengdu 610039, China;