Today, "Big Data" poses a new information overload problem in many different areas. Such areas include health care (e.g., medical records, bioinformatics), e-sciences (e.g., physics, chemistry, and geology), and social sciences (e.g., politics) [Bizer et al. 2011, Jung 2009]. As various types of data become available from a growing number of sources, it is becoming increasingly difficult to process such Big Data efficiently. Distributed computing technologies (e.g., Hadoop, Hive and Pig) are strongly related to the "Big Data" issues [Hogarth and Soyer 2015, Jung 2012]. Given very large-scale Big Data, efficient distributed data processing and management remain a challenge in many research areas, for example, information acquisition and stream processing, as well as data integration [Madden 2012]. Also, a number of diverse information processing system architectures might be involved in these areas. Such systems need to exploit relevant solutions to support a number of intelligent services (e.g., knowledge management and decision making). The aim of this special issue is to bring together researchers and practitioners in areas of distributed computing to share their visions, research achievements and solutions, to resolve the issues of big data processing, and to establish worldwide cooperative research and development. This will give an opportunity to further the discussion of the potential of knowledge and semantic systems across many communities.

This special issue is devoted to the analysis of these "Big Data" sources and, more importantly, to identifying the areas where Big Data can be applied and can provide knowledge that is not accessible through other types of analysis. Additionally, applications of Big Data can be investigated from either a static or a dynamic perspective. We seek business and industrial applications of Big Data that help to solve real-world problems. The area of Big Data analytics brings together researchers and practitioners from different fields, and the main goal of this special issue is to give these people the opportunity to share their visions, research achievements and solutions, as well as to establish worldwide cooperative research and development. At the same time, we want to provide a platform for discussing research topics underlying the concepts of Big Data analytics and its applications by inviting members of different communities who share this common interest in investigating social networks.

The first paper in this issue, authored by Ana I. Torre-Bastida et al., proposes interesting big data analytical functionalities for heterogeneous databases. In particular, two complementary use cases are presented to illustrate the potential of using open data in the business domain. The first is the creation of a knowledge base of existing and potential customers, exploiting social and linked open data, from which any given organization might infer valuable information to support decision making. The second focuses on the classification of organizations and enterprises, aiming to detect potential competitors and/or allies by analyzing the conceptual similarity between the projects in which they have participated.

The second paper, authored by Paloma Cáceres et al., introduces a big data processing scheme for understanding public bus networks. The proposed process models and links accessibility data by using ontological knowledge.

In the third paper, Quang Dieu Tran and Jai E. Jung present a software platform for discovering contents and stories in movies. They claim that movies are an important big data source for digital cultural content and for understanding our society. The system automatically analyzes movies by discovering social networks and computing various social measures.

The fourth paper, by Zbyněk Falt et al., focuses on parallel data processing and parallel streaming systems for big data analytics. One of the key components of these systems is the task scheduler, which plans and executes tasks spawned by the application on the available CPU cores. The proposed task scheduler, combined with the new memory allocator, achieves a speed-up on a NUMA system and up to a 10% speed-up on an older SMP system with respect to the unoptimized versions of the scheduler and allocator.

In the fifth paper, Héctor Allende-Cid et al. focus on the distributed regression problem. A new Distributed Regression System is presented, which makes use of a discrete representation of the probability density functions (pdfs). Neighborhoods of similar datasets are detected by comparing their approximated pdfs. This information supports an ensemble-based approach and the improvement of a second-level unit, as is the case in stacked generalization.

The sixth paper, by Alejandro Zambrano et al., introduces a visualization algorithm that improves the understandability of run-time production systems. The visualization system has been designed using the Set of Experience Knowledge Structure (SOEKS).

The seventh paper, by Ngoc Tu Luong et al., proposes an efficient method for analyzing large-scale publication data. In particular, a recommendation system based on the data is built to assist users in collaboration.

This special issue is the result of a number of fruitful collaborations. We would like to thank the editor-in-chief of the Journal of Universal Computer Science (JUCS), Christian Gütl, for his kind support and help during the entire publication process. The special issue contains 7 high-quality papers selected out of 17 submissions (an acceptance rate of about 41%). This was possible thanks to the work of the renowned researchers who provided their anonymous reviews.

Finally, we are most grateful to the authors for their valuable contributions and for their willingness and efforts to improve their papers in accordance with the reviewers' suggestions and comments.

Jason J. Jung, David Camacho, and Costin Badica
(Seoul, Korea, May 4, 2015)

References

[Bizer et al. 2011] Bizer, C., Boncz, P., Brodie, M.L., Erling, O.: The Meaningful Use of Big Data: Four Perspectives - Four Challenges. SIGMOD Record, 40(4):56-60, 2011.

[Hogarth and Soyer 2015] Hogarth, R.M., Soyer, E.: Using Simulated Experience to Make Sense of Big Data. MIT Sloan Management Review, 56(2):49-54, 2015.

[Jung 2009] Jung, J.J.: Contextualized query sampling to discover semantic resource descriptions on the web. Information Processing & Management, 45(2):283-290, 2009.

[Jung 2012] Jung, J.J.: Evolutionary Approach for Semantic-based Query Sampling in Large-scale Information Sources. Information Sciences, 182(1):30-39, 2012.

[Madden 2012] Madden, S.: From Databases to Big Data. IEEE Internet Computing, 16(3):4-6, 2012.
