Efficient Detection of Large-Scale Multimedia Network Information Data Anomalies Based on the Rule-Extracting Matrix Algorithm

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

With the continuous development of multimedia social networks, online public opinion information has become increasingly widespread. Hot topics among the general public, such as those on Baidu Hot Search and Weibo Hot Search, are increasing rapidly and have become a new trend in information dissemination [1–3]. This kind of multimedia network is a double-edged sword: it can convey both positive and negative information, so it can guide public judgment but can also seriously affect the harmonious and stable development of society. Effectively detecting abnormal network information has therefore become an urgent problem. The industry has explored many approaches, such as using the KNN method to detect abnormal network information data, using confidence methods for training, and comparing abnormal samples to detect abnormal information data in the network. However, these methods have limitations, and their missed detection rate is relatively high [4, 5]. Abnormal information detection therefore remains an important research direction, and a variety of detection operators have appeared one after another, such as the Förstner operator, the SUSAN operator, and the Harris operator. In addition, some scholars have used singular value decomposition for analysis, but this approach produced visual and auditory errors, and its effect is not obvious [6].

The rule-extracting matrix algorithm extracts implicit and useful knowledge and rules from the data warehouse. However, because the data vary in complexity, it is difficult to ensure that the data are mined completely [7, 8]. In this paper, network format data are detected through filtering, and anomalies in network information data are detected through probability calculation based on the rule-extracting matrix algorithm. The prior probability is calculated and used to detect network data abnormalities, with the aim of exploring the performance of abnormal network data detection.

2. Rule-Extracting Matrix Algorithm

In multimedia network information detection, the most obvious distinguishing feature is words with emotional color. It is therefore effective to identify, extract, and detect abnormal network information based on emotional words. The overall process is as follows: first, the multimedia network information is captured; second, the corresponding data samples are analyzed and processed with Chinese word segmentation technology; finally, the word segmentation results are matched against the existing abnormal information database, and abnormal data in the multimedia network are detected according to the corresponding abnormal information threshold [9]. $\begin{matrix} (1) & A(t) = \sqrt{e} \cdot B^{H}(t) \cdot x_{n}, \\ (2) & H_{B}(z) = \frac{1 + \sin ϑ_{2}}{\cos ϑ_{2}} \cdot \frac{\cos ϑ_{1} \cdot z^{- 1}}{\sin ϑ_{2}} \cdot G(z) \cdot A(t), \end{matrix}$ where $A(t)$ represents the estimated oscillation attenuation of the network information data traffic obtained after interference filtering and represents the variance coefficient in multimedia network data, and $B^{H}(t)$ represents the negative information feature detection-related function in the network information data traffic.

Using the abnormal information data provided by the multimedia network information database as a benchmark, the normal information and abnormal information in the subdatabase are counted separately, and the frequency with which a single word appears in normal information and in abnormal information is calculated. The specific calculation formulas are shown in formulas (3) and (4). $\begin{matrix} (3) & A_{c_{i}} = \frac{f P_{c_{i}}}{\sum_{j = 1}^{m} f P_{c_{j}}}, \\ (4) & B_{c_{i}} = \frac{f N_{c_{i}}}{\sum_{j = 1}^{n} f N_{c_{j}}}, \end{matrix}$ where $A_{c_{i}}$ is the weight of the character or word segment $c_{i}$ in the normal data; $f P_{c_{i}}$ is the normal word frequency of $c_{i}$; $B_{c_{i}}$ is the weight of the word segment in the abnormal data; and $f N_{c_{i}}$ is the frequency of $c_{i}$ among the abnormal words. In addition, $m$ and $n$ are the numbers of different characters in the normal information database and the abnormal information database, respectively.

From formulas (3) and (4), it can be seen that the abnormal information data in the word segmentation and in the word database are relatively similar. The specific calculation formulas are shown in formulas (5) and (6): $\begin{matrix} (5) & G(z) = \frac{1 - \sin ϑ_{2}}{2} \cdot \frac{1 - z^{- 1}}{\sin ϑ_{1} (1 + \sin ϑ_{2})}, \\ (6) & S_{c_{i}} = A_{c_{i}} - B_{c_{i}}, \end{matrix}$ where $S_{c_{i}}$ is the threshold for measuring whether the word belongs to normal or abnormal information. Because $S_{c_{i}}$ is the normal weight minus the abnormal weight, $S_{c_{i}} \geq 0$ indicates that the word leans toward normal information, whereas $S_{c_{i}} < 0$ indicates that it leans toward abnormal information in the network information data.

According to formula (6), the threshold of normal and abnormal information is calculated, and the emotional tendency of a word can be expressed by the following formula: $\begin{matrix} (7) & S_{w} = \frac{1}{n} \cdot \sum_{i} S_{c_{i}} . \end{matrix}$

The calculated result $S_{w}$ is the detection result for the abnormal information data of the multimedia network.
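The scoring in formulas (3), (4), (6), and (7) can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes that frequencies are counted per character, that the average in formula (7) runs over the characters of the word being scored, and that unseen characters contribute a score of zero. The function names and the toy corpora are hypothetical.

```python
from collections import Counter


def char_scores(normal_texts, abnormal_texts):
    """Per-character score S_c = A_c - B_c, following formulas (3), (4), (6).

    Assumption: A_c and B_c are relative character frequencies in the
    normal and abnormal corpora, respectively.
    """
    normal_counts = Counter(ch for text in normal_texts for ch in text)
    abnormal_counts = Counter(ch for text in abnormal_texts for ch in text)
    normal_total = sum(normal_counts.values())
    abnormal_total = sum(abnormal_counts.values())

    scores = {}
    for ch in set(normal_counts) | set(abnormal_counts):
        a = normal_counts[ch] / normal_total if normal_total else 0.0
        b = abnormal_counts[ch] / abnormal_total if abnormal_total else 0.0
        scores[ch] = a - b
    return scores


def word_score(word, scores):
    """Formula (7): mean of the character scores of the word.

    Assumption: characters missing from both corpora score 0.
    """
    if not word:
        return 0.0
    return sum(scores.get(ch, 0.0) for ch in word) / len(word)


# Toy corpora (made up for illustration only).
scores = char_scores(normal_texts=["aab"], abnormal_texts=["abb"])
leans_normal = word_score("aa", scores)    # positive: closer to normal data
leans_abnormal = word_score("bb", scores)  # negative: closer to abnormal data
```

Under these assumptions, a word is flagged as abnormal when its score falls below the chosen threshold, for example below zero.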

When abnormal data are detected in multimedia networks, the data contain not only a large amount of effective information but also many kinds of redundant interference information. This interference affects the detection results for abnormal data and causes a certain missed detection rate [10–14].

3. Model Construction and Problem Description

3.1. Large-Scale Visual Multimedia Network Model

The current multimedia network model uses a topological network to serve the Internet for service integration. Based on computer graphics and image processing, this paper builds the network structure for the large-scale visual multimedia network simulation by constructing the multimedia visual network model shown in Figure 1.

[figure omitted; refer to PDF]

As shown in Figure 1, the requests for each multimedia storage at the same time may differ, but only one storage operation can be performed. Let $|LFN|$ be the size of the data file request, as shown in the following formula: $\begin{matrix} (8) & {RC}_{r} = β \times \sum |LFN|, \end{matrix}$ where $β$ is a constant that expresses the degree of parallelism. The copy of the abnormal data of the multimedia network can be calculated as shown in the following formula: $\begin{matrix} (9) & RV = w_{1} \times \frac{1}{S} + w_{2} \times NA + w_{3} \times \frac{1}{CT - LA}, \end{matrix}$ where the values of $w_{1}$, $w_{2}$, and $w_{3}$ can be set by the user and $K$ is the output set of the multiplexer, indexed as shown in the following formula: $\begin{matrix} (10) & k = 1, 2, 3, \dots, K . \end{matrix}$
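Formulas (8) and (9) are plain weighted sums, which the following sketch evaluates directly. The source does not define $S$, $NA$, $CT$, or $LA$; the readings used here (file size, number of accesses, current time, last access time) are assumptions made only for illustration, as are the function and parameter names.

```python
def request_cost(file_sizes, beta):
    """Formula (8): RC_r = beta * sum(|LFN|) over the requested files."""
    return beta * sum(file_sizes)


def replica_value(size, num_accesses, current_time, last_access,
                  w1=1.0, w2=1.0, w3=1.0):
    """Formula (9): RV = w1 * 1/S + w2 * NA + w3 * 1/(CT - LA).

    Assumed meanings (not stated in the source): S = size,
    NA = num_accesses, CT = current_time, LA = last_access.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if current_time <= last_access:
        raise ValueError("current_time must be later than last_access")
    return (w1 * (1.0 / size)
            + w2 * num_accesses
            + w3 * (1.0 / (current_time - last_access)))


# Illustrative values only.
cost = request_cost([1.0, 2.0, 3.0], beta=0.5)
value = replica_value(size=2.0, num_accesses=3, current_time=10.0, last_access=6.0)
```

Changing the weights shifts the balance among the three indicators, which is what makes the function tunable.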

In a large-scale visual multimedia network topology, abnormal data include mutation data such as error codes, and mutation abnormal data also occur in custom data [15]. Inspecting the abnormal data of the large-scale visual multimedia network requires its abnormal information, as shown in formulas (11) and (12): $\begin{matrix} (11) & T = T_{1} + T_{2}, \\ (12) & F(x, y) = w_{1} \times T \times w_{2} \times D(x, y) . \end{matrix}$

This function can be tailored because it is defined as a weighted combination of three indicators. The parallel computing time required for the same chain length on different numbers of cores is predicted, and a numerical simulation is obtained of the CPU running time for attack data captured by web firewalls at different chain lengths. This completes the construction of the large-scale visual multimedia network model, which provides a model basis for efficient detection of big data.

3.2. Feature Extraction of Big Data in Large-Scale Visual Multimedia Network

If the watermark image in the large-scale visual multimedia network information is represented by $B = (b_{i, j})$, $1 \leq i, j \leq m$, where $m$ is the dimension of the matrix, the singular values are obtained by solving for the singular values of the matrix, as shown in formula (13): $\begin{matrix} (13) & A a_{i, j} = U_{i, j}^{*} S_{i, j}^{*} V_{i, j} . \end{matrix}$

On the basis of formula (13), a small matrix unit $A a_{i, j}$ is obtained. According to a certain decomposition of singular values, a matrix of singular values is obtained, as shown in the following formula: $\begin{matrix} (14) & A a_{i, j} = U_{i, j}^{*} S_{i, j}^{*} V_{i, j}^{T} . \end{matrix}$
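As a small worked illustration of the factorization $U S V^{T}$ in formula (14), consider a hypothetical $2 \times 2$ block. The numbers are made up and are not taken from the paper; the example only shows how the singular values appear on the diagonal of $S$.

```latex
% Hypothetical 2x2 block (illustrative values only)
A = \begin{pmatrix} 0 & 2 \\ 3 & 0 \end{pmatrix},
\qquad
A^{T} A = \begin{pmatrix} 9 & 0 \\ 0 & 4 \end{pmatrix}
\;\Rightarrow\; \sigma_{1} = 3,\ \sigma_{2} = 2 .

% Right singular vectors are the eigenvectors of A^T A, so V = I.
% Left singular vectors are u_i = A v_i / \sigma_i.
U = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\qquad
S = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix},
\qquad
V = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},

% Check: the product reproduces A.
U S V^{T}
= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
  \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}
  \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} 0 & 2 \\ 3 & 0 \end{pmatrix}
= A .
```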

The undirected graph analysis method is used to design the network model. The finite queuing model of each channel NAV task is used to represent the state space, as shown in the following formula: $\begin{matrix} (15) & S = \{ (k, n) : 0 \leq k \leq K, 0 \leq n \leq N \} . \end{matrix}$

Suppose that in the multimedia network database, the predictive performance of data packet information is $\{ λ_{i} : 1 \leq i \leq S \}$, and the criterion is $\{ R_{j} : 1 \leq j \leq L \}$, where $S$ represents the amount of input data and $L$ represents the total bandwidth ratio used in the channel. The channel allocation packet conversion waiting time is $\begin{matrix} (16) & W_{q} = W - \bar{X} = \frac{1}{γ} \sum_{k = 1}^{K} \sum_{n = 1}^{N} k p_{k, n} - \frac{(N - 1) μ + r}{μ r} . \end{matrix}$

In the $D$-dimensional search space established during the mobile learning process, the population size is set to $N$, the current position of the particle is $X_{i} = (x_{i1}, x_{i2}, \dots, x_{iD})$, the speed is $V_{i} = (v_{i1}, v_{i2}, \dots, v_{iD})$, the power consumption of the data stream received from all adjacent nodes is $C$, and the accurate prediction of the results is $B^{H}(t)$. The semisupervised control prediction result of mobile learning is obtained as $\begin{matrix} (17) & \hat{q} = \arg \min_{q = 1} \sum_{q = 1} q q | e - \sum_{q = \bar{q}}^{q} q q | e . \end{matrix}$

To realize learning from other particles during evolution, the linear superposition detail capture algorithm is used to perform the optimal configuration of the search ability balance state, and the objective function of supervised learning is obtained as $\begin{matrix} (18) & L(\bar{q}, q) = \begin{cases} 1, & \bar{q} - q > \frac{δ}{2}, \\ 0, & \bar{q} - q \leq \frac{δ}{2} . \end{cases} \end{matrix}$
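The thresholded objective in formula (18) can be evaluated as in the short sketch below. It follows the formula as printed, using the signed difference $\bar{q} - q$ rather than an absolute value; the function names and sample values are hypothetical.

```python
def threshold_loss(q_pred, q_true, delta):
    """Formula (18): 1 if (q_pred - q_true) > delta / 2, otherwise 0.

    The signed difference is used because the source formula is printed
    without absolute-value bars.
    """
    return 1 if (q_pred - q_true) > delta / 2 else 0


def mean_threshold_loss(preds, targets, delta):
    """Average of the 0/1 losses over paired predictions and targets."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    return sum(threshold_loss(p, t, delta) for p, t in zip(preds, targets)) / len(preds)


# Illustrative values only.
loss_far = threshold_loss(1.0, 0.2, delta=1.0)    # difference 0.8 exceeds 0.5
loss_near = threshold_loss(0.6, 0.2, delta=1.0)   # difference 0.4 does not
average = mean_threshold_loss([1.0, 0.6], [0.2, 0.2], delta=1.0)
```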

In the formula, $B^{H}(t)$ refers to the related parameters of the scheduling task instruction in the M-learning process, and the computer vision features can be effectively extracted, which can be explained by the following formulas: $\begin{matrix} (19) & D(C_{1}, C_{2}) = \begin{cases} \text{true}, & \text{if } Dif(C_{1}, C_{2}) > MInt(C_{1}, C_{2}), \\ \text{false}, & \text{otherwise}, \end{cases} \\ (20) & MInt(C_{1}, C_{2}) = \min (Int(C_{1}) + τ(C_{1}), Int(C_{2}) + τ(C_{2})), \end{matrix}$ where the Canny edge detection function $H_{B}(z)$ is a constant and $|C|$ is the size of the area $C$, which realizes the feature extraction of big data in a large-scale visual multimedia network.
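The region comparison in formulas (19) and (20) can be sketched as follows. The source does not give a formula for $τ(C)$; the form $τ(C) = k / |C|$ used here is an assumption borrowed from graph-based image segmentation, with $k$ a hypothetical scale constant. All function names and numbers are illustrative.

```python
def tau(component_size, k):
    """Threshold term for a region of the given size.

    Assumption: tau(C) = k / |C|. The source omits this definition.
    """
    if component_size <= 0:
        raise ValueError("component_size must be positive")
    return k / component_size


def min_internal_difference(int1, size1, int2, size2, k):
    """Formula (20): MInt = min(Int(C1) + tau(C1), Int(C2) + tau(C2))."""
    return min(int1 + tau(size1, k), int2 + tau(size2, k))


def boundary_exists(dif, int1, size1, int2, size2, k):
    """Formula (19): true when Dif(C1, C2) exceeds MInt(C1, C2)."""
    return dif > min_internal_difference(int1, size1, int2, size2, k)


# Illustrative regions: Int(C1)=1.0 with |C1|=10, Int(C2)=2.0 with |C2|=5, k=10.
# MInt = min(1.0 + 1.0, 2.0 + 2.0) = 2.0
separate = boundary_exists(dif=5.0, int1=1.0, size1=10, int2=2.0, size2=5, k=10.0)
merge = not boundary_exists(dif=1.5, int1=1.0, size1=10, int2=2.0, size2=5, k=10.0)
```

With this choice of $τ$, small regions need stronger evidence of a boundary, because $k / |C|$ is large when $|C|$ is small.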

The stability of multimedia networks and the quality of video and audio transmission are improved through the efficient detection of abnormal data. According to the above description, the implementation process of the improved algorithm is described as shown in Figure 2.

[figure omitted; refer to PDF]

The sparse iterative expression of the filter can be expressed by the following formula: $\begin{matrix} (22) & ϑ_{1}(k + 1) = ϑ_{2}(k + 1) = ϑ(k) - y(k) \cdot G(z) . \end{matrix}$

According to FIR anti-interference filtering, the flow output of the multimedia network information data center can be expressed by the following formula: $\begin{matrix} (23) & z(t) = x(t) + y(t) = a(t) + ϑ_{1}(k + 1) + n(t) . \end{matrix}$

4.2. Anomaly Information Data Detection Method Based on Rule-Extracting Matrix Algorithm

Based on the various public opinion topics in the sample, the corresponding probability expression is built, and the record to be detected can be expressed by the following formula: $\begin{matrix} (24) & P(Category | z(t)) = \frac{P(Category) \cdot P(z(t) | Category)}{P(z(t))}, \end{matrix}$ where $P(Category | z(t))$ represents the probability that the record to be detected belongs to a given category. Each data record to be detected can be converted from a data vector into a content vector: $\begin{matrix} (25) & P(Category | z(t)) = \frac{P(Category) \cdot P(x_{i 1}, \dots, x_{i n} | Category)}{P(x_{i 1}, \dots, x_{i n})}, \end{matrix}$ where $P(x_{i 1}, \dots, x_{i n})$ is a constant. According to the quantitative calculation of formula (25), the corresponding objective function $F_{n}$ can be constructed, as shown in the following formula: $\begin{matrix} (26) & F_{n} = \arg \max P(Category | z(t)) = \arg \max P(Category) \cdot P(x_{i 1}, \dots, x_{i n} | Category) . \end{matrix}$

Under the assumption of the rule-extracting matrix algorithm, the component values of each record are independent of one another given the category, so the joint probability is the product of the component probabilities. Taking the maximum, the final objective function is $\begin{matrix} (27) & F_{n} = \arg \max P(Category) \prod P(x_{i n} | Category) . \end{matrix}$
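The decision rule in formula (27) has the same form as a naive Bayes classifier, so it can be sketched as below. This is an illustrative sketch under stated assumptions, not the paper's implementation: features are treated as categorical values per position, probabilities are estimated by counting, Laplace smoothing (`alpha`) is added to avoid zero probabilities, and the product is evaluated in log space for numerical stability. The function names and toy records are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(records, labels, alpha=1.0):
    """Count class priors and per-position feature frequencies.

    Assumption: Laplace smoothing with strength alpha (not in the source).
    """
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (label, position) -> value counts
    vocab = defaultdict(set)               # position -> observed values
    for record, label in zip(records, labels):
        for pos, value in enumerate(record):
            feature_counts[(label, pos)][value] += 1
            vocab[pos].add(value)
    return class_counts, feature_counts, vocab, alpha


def predict(model, record):
    """Formula (27): argmax over categories of P(Category) * prod P(x | Category)."""
    class_counts, feature_counts, vocab, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for pos, value in enumerate(record):
            numerator = feature_counts[(label, pos)][value] + alpha
            denominator = count + alpha * len(vocab[pos])
            score += math.log(numerator / denominator)  # log likelihood term
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy records (made up for illustration only).
records = [("hot", "rumor"), ("hot", "news"), ("calm", "news"), ("hot", "rumor")]
labels = ["abnormal", "normal", "normal", "abnormal"]
model = train(records, labels)
first = predict(model, ("hot", "rumor"))
second = predict(model, ("calm", "news"))
```

The denominator $P(x_{i1}, \dots, x_{in})$ from formula (25) is omitted because it is the same for every category and does not change the argmax.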

5. Experimental Results and Analysis

To detect and simulate abnormal data in the multimedia network effectively, the experimental data were collected from Weibo Hot Search over 2 months (October-November 19), yielding 5000 pieces of information. The content covers current affairs and politics, sports, science and technology, literature, and so on. These data, which include both normal and abnormal information, are vectorized to form the final experimental dataset.

(1) The false positive rate of rule-extracting matrix algorithm is tested.

(2) The missed detection rate of the rule-extracting matrix algorithm is tested.

The corresponding results are obtained through two simulation experiments, as follows.

5.1. Experiment 1

In Figure 4, the false positive rate is the probability that normal data are judged to be abnormal data. The results show that, for the same number of experimental iterations, the false positive rates of the three detection algorithms differ considerably. The multimedia network negative information data detection method based on the rule-extracting matrix algorithm has a low false positive rate overall, with a maximum of no more than 5%. For the other methods, namely the information gain network anomaly detection method and the data detection methods based on migration technology and D-S theory, the false positive rate is more than 25%.

[figures omitted; refer to PDF]

5.2. Experiment 2

In Figure 5, U represents the number of abnormal data in the experimental data subset; 0 represents the number of missed detections of network anomaly detection methods based on data mining and information extraction; and J represents the number of missed detections of the multimedia network negative information data detection method based on the rule-extracting matrix algorithm.

[figure omitted; refer to PDF]

Figure 5 shows that the missed detection rate of the network anomaly detection method based on data mining and information extraction is about 23.6%, whereas the missed detection rate of the multimedia network negative information data detection method based on the rule-extracting matrix algorithm is about 0.67%. This comparison shows that the proposed method has a low probability of missed detection and is practical.
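The two metrics used in Experiments 1 and 2 can be computed from labeled predictions as in the sketch below. It assumes the encoding 1 = abnormal and 0 = normal; the labels shown are made up for illustration and are not the paper's data.

```python
def false_positive_rate(y_true, y_pred):
    """Share of normal records (label 0) that were judged abnormal (1)."""
    normal_preds = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not normal_preds:
        return 0.0
    return sum(1 for p in normal_preds if p == 1) / len(normal_preds)


def missed_detection_rate(y_true, y_pred):
    """Share of abnormal records (label 1) that were judged normal (0)."""
    abnormal_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not abnormal_preds:
        return 0.0
    return sum(1 for p in abnormal_preds if p == 0) / len(abnormal_preds)


# Illustrative labels only: four normal records followed by four abnormal ones.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1]
fpr = false_positive_rate(y_true, y_pred)
miss = missed_detection_rate(y_true, y_pred)
```

In the notation of Figure 5, the missed detection rate corresponds to the number of missed detections divided by $U$, the number of abnormal records in the subset.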

6. Conclusions

A multimedia network is a double-edged sword. It can convey both positive and negative information, so it can guide public judgment but can also greatly affect the harmonious and stable development of society. The data detection network is used to remove the interference of harmful factors in the network, which can improve detection efficiency and reduce the missed detection rate. Experimental comparison shows that the rule-extracting matrix algorithm proposed in this paper achieves a low false positive rate and a low missed detection rate.

References

[1] B. Hou, C. Hou, T. Zhou, Z. Cai, F. Liu, "Detection and characterization of network anomalies in large-scale RTT time series," IEEE Transactions on Network and Service Management, vol. 18 no. 1, DOI: 10.1109/tnsm.2021.3050495, 2021.

[2] C. Benhamed, S. Mekaoui, K. Ghoumid, "Large-scale ip network data analysis for anomalies detection thanks to svm," International Journal of Design & Nature and Ecodynamics, vol. 11 no. 3, pp. 376-386, DOI: 10.2495/dne-v11-n3-376-386, 2016.

[3] M. L. Shyu, Y. Yan, J. Chen, "Efficient large-scale stance detection in tweets," International Journal of Multimedia Data Engineering & Management, vol. 9 no. 3, 2018.

[4] A. Abrardo, M. Martalò, G. Ferrari, "Information fusion for efficient target detection in large-scale surveillance wireless sensor networks," Information Fusion, vol. 38, pp. 55-64, DOI: 10.1016/j.inffus.2017.02.002, 2017.

[5] M. Mardani, "Leveraging sparsity and low rank for large-scale networks and data science," Dissertations & Theses Gradworks, vol. 3 no. 2, 2015.

[6] D. C. V. F. Julio, J. Seixas, G. L. Miotto, "Detection of data taking anomalies for the ATLAS experiment," China Water & Wastewater, vol. 3 no. 1, pp. 42-50, 2015.

[7] S. Kanarachos, S.-R. G. Christopoulos, A. Chroneos, M. E. Fitzpatrick, "Detecting anomalies in time series data via a deep learning algorithm combining wavelets, neural networks and Hilbert transform," Expert Systems with Applications, vol. 85 no. 11, pp. 292-304, DOI: 10.1016/j.eswa.2017.04.028, 2017.

[8] S. Papadopoulos, A. Drosou, D. Tzovaras, "A novel graph-based descriptor for the detection of billing-related anomalies in cellular mobile networks," IEEE Transactions on Mobile Computing, vol. 15 no. 11, pp. 2655-2668, DOI: 10.1109/tmc.2016.2518668, 2016.

[9] V. A. Skazin, A. V. Pavlychev, S. S. Zotov, "Detection of network anomalies in log files using machine learning methods," IOP Conference Series: Materials Science and Engineering, vol. 1069 no. 1, pp. 120-126, DOI: 10.1088/1757-899x/1069/1/012021, 2021.

[10] Z. Tang, Z. Chen, Y. Bao, H. Li, "Convolutional neural network-based data anomaly detection method using multiple information for structural health monitoring," Structural Control and Health Monitoring, vol. 26 no. 1, DOI: 10.1002/stc.2296, 2019.

[11] H. Kasai, W. Kellerer, M. Kleinsteuber, "Network volume anomaly detection and identification in large-scale networks based on online time-structured traffic tensor tracking," IEEE Transactions on Network and Service Management, vol. 13 no. 3, pp. 636-650, DOI: 10.1109/tnsm.2016.2598788, 2016.

[12] G. Al Naymat, M. Al Kasassbeh, E. Al Harwari, "Using machine learning methods for detecting network anomalies within SNMP-MIB dataset," International Journal of Wireless and Mobile Computing, vol. 15 no. 1, pp. 67-76, DOI: 10.1504/ijwmc.2018.10015860, 2018.

[13] D. Vallejo-Huanga, M. Ambuludi, P. Morillo, "Empirical exploration of machine learning techniques for detection of anomalies based on NIDS," IEEE Latin America Transactions, vol. 19 no. 5, pp. 772-779, DOI: 10.1109/tla.2021.9448311, 2021.

[14] N. R. Storozhenko, A. I. Goleva, D. A. Tunkov, V. I. Potapov, "Modern problems of information systems and data networks: choice of network equipment, monitoring and detecting deviations and faults," Journal of Physics: Conference Series, vol. 1546 no. 1, pp. 120-130, DOI: 10.1088/1742-6596/1546/1/012030, 2020.

[15] L. Dimitriou, C. Antoniou, "Monitoring social network formation and information content analysis of transport anomalies: the case of airline crashes," Journal of Air Transport Management, vol. 65 no. 10, pp. 127-141, DOI: 10.1016/j.jairtraman.2017.09.011, 2017.

Copyright © 2021 Jie Zhao. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0/

Abstract

With the continuous development of multimedia social networks, online public opinion information is becoming more and more widespread. The rule-extracting matrix algorithm can effectively improve the detection probability of the information data to be tested. Network information data anomaly detection is realized through probability calculation, in which the prior probability is calculated to detect abnormal network data. Experimental results show that the rule-extracting matrix algorithm can effectively control the false positive rate of sample data, improve detection accuracy, and deliver efficient detection performance.

Details

Title

Efficient Detection of Large-Scale Multimedia Network Information Data Anomalies Based on the Rule-Extracting Matrix Algorithm

Author

Zhao, Jie¹

¹ College of Electronic Information Science, Fujian Jiangxia University, Fuzhou, Fujian 350108, China

Editor

Zhendong Mu

Publication year

2021

Publication date

2021

Publisher

John Wiley & Sons, Inc.

ISSN

16875680

e-ISSN

16875699

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1155/2021/3299891

ProQuest document ID

2609153972
