A Bibliometric Analysis and Review on Performance Modeling Literature

In management practice, performance indicators are considered a prerequisite for making informed decisions in line with the organization’s goals. On the other hand, indicators summarize compound phenomena in a few digits, which can lead to inadequate decisions, biased by information loss and conflicting values. Model-driven approaches in enterprise engineering can be very effective in avoiding these pitfalls, or in keeping them under control. For that reason, “performance modeling” has the potential to play a primary role in the “model driven enterprise” scenario, together with process, information and other enterprise-related aspects. In this perspective, we propose a systematic review of the literature on performance modeling in order to retrieve, classify, and summarize existing research, identify the core authors and define areas and opportunities for future research.


Introduction
After more than a decade of heated debate between proponents and detractors of model-driven approaches in enterprise engineering, the research field is quite active. A number of approaches are discussed in the literature to model aspects such as strategies, organization [1], [2], processes [3]-[5], resources, information systems [6] and other key enterprise-related aspects. In specific sectors (e.g., companies involved in software production in a broad sense, such as automotive, banking and printing), industry-wide studies on Model Driven Engineering revealed the increasing importance of conceptual models and Domain Specific Modeling Languages (DSML) for describing and solving domain-specific problems [7].
According to [8], the above-mentioned modeling techniques can be classified into five main categories, namely (1) human sense-making and communication, (2) computer-assisted analysis, (3) business process management and quality assurance, (4) model deployment and activation, and (5) giving context in the system development phase. All these categories share two transversal purposes: to cope with the complexity of real enterprises by means of schematic representations that "capture" only the most relevant aspects and leave out the minor ones, and to measure or estimate some of the main characteristics related to the performance of enterprises.
Focusing on the latter purpose, according to Strecker et al. [1] it is crucial to observe that, in management practice, performance indicators are considered a prerequisite for making informed decisions in line with the organization's goals. On the other hand, performance indicators summarize compound phenomena in a few values, which can lead to information loss and inadequate decisions [1]. Moreover, the intensions associated with different indicators can conflict, and this too can lead management to wrong decisions. To face this ambivalence, some authors proposed applying conceptual modeling to performance indicators, developing different modeling approaches based on domain-specific modeling languages as well as on ontologies. The aim of these models is to define shared knowledge on performance measurement, to support the model-driven development of Information Systems components and, thus, to support firms in their decision-making processes. Although this is a novel field of research, it attracts interest from both the managerial perspective and the Information Systems (IS) one. The managerial use of indicators is motivated by several decades of research on performance-driven management and on the interrelations among indicators. More recently, IS researchers have also started to discuss the need for a "new generation of enterprise systems" [9] able to face the challenges brought by a collaboration-driven society. However, in order to solve these issues, IS design should start with "understanding the business and organizational domain in which the information system is to be embedded" by means of conceptual models [10].
In this perspective, the aim of this work is to analyze the trend of this topic, to provide a systematic literature review on performance modeling and to classify the retrieved articles by means of a comparison framework. This should foster future work and help identify possible areas of research. Indeed, whilst there is a need for literature on this topic, few works take these aspects into consideration. The rest of the work is structured as follows. First, we describe the method used to retrieve, select and classify the articles. In Section 3, we present the bibliometric analysis, and in Section 4 we analyze some of the works on performance modeling. Finally, we classify these works (Section 5) and draw our conclusions (Section 6).

Objectives and Research Questions
This study provides a systematic review of the literature [11] with a twofold objective:
─ to identify, classify, and summarize existing research on performance modeling;
─ to identify areas and opportunities for future research.
In particular, the study will answer the following research questions:
RQ#1: What is the trend of the topic?
RQ#1.1: How many papers have been published on the topic per year?
RQ#1.2: How many authors contributed per year to the topic?
RQ#1.3: How many citations do these papers have per year?
RQ#2: Which authors have contributed, and currently contribute, the most to this topic, i.e., which are the core authors and which are the current authors?
RQ#3: Which are the most cited works?
RQ#4: What is the degree of awareness of the community (i.e., are the authors of these works aware of other similar works)?
RQ#5: What is the breadth and the level of granularity of the analysis (e.g., individual enterprises, collaboration, goals, Key Performance Indicators (KPIs) or Process Performance Indicators (PPIs))?
RQ#6: How many methods and methodological approaches have been used?
RQ#8: What is the level of formalization and of complexity of the models?
After answering these questions, we will analyze the 5 most cited works in more detail.

Selection process
For the purpose of this study, we analyze only works related to the modeling of performance-related aspects and we exclude works concerning other modeling objects, although we recognize the importance of the other enterprise modeling topics previously mentioned. In particular, we focus only on works related to conceptual modeling, defined as "[…] the activity of formally describing some aspects of the physical and social world around us for purposes of understanding and communication […]" [12], and we exclude works concerned with mathematical modeling, theoretical frameworks and so on. We also excluded works focused more on enterprise architecture than on modeling itself and tools oriented to business process monitoring with few performance modeling aspects [13]-[17], as well as works on non-formal representations of KPIs [18]-[22].
In order to achieve the aforementioned goal, we performed a five-phase analysis.

Keywords and databases selection.
In this study, the search process is carried out through a keyword-based search on a set of selected databases. The reason for this choice lies in the inherent interdisciplinarity of the topic, which makes it difficult to perform a systematic search based on personal reading of conferences and journals [23].
As for the keywords, we identified three sets: the first concerning indicators, the second the model and the third the business domain. The keywords for the first set were selected according to the terms used to describe the performance measurement phenomenon in the management accounting domain. The aim was to include all papers whose titles contain terms such as performance measurement, performance monitoring, Key Performance Indicator, KPI and enterprise monitoring. The second set of keywords is meant to restrict the search to conceptual models; thus it includes terms such as "modeling", "model", "formalization", "ontology", "ontological" and "semantic". We used the asterisk wildcard, as in model*, in order to include both plurals and similar words.
The selection of the terms to include in the first and second sets followed an iterative process [23]. By iterative process, we mean that we refined the keywords after the first search on the databases and on the bibliographies of the selected papers [24].
As for the academic databases, we decided to use Scopus [25] and Web of Science (WoS) [26]. Indeed, we empirically observed that other databases, such as Google Scholar [27], are characterized by a lower precision and a higher recall, and do not add a significant number of useful results to those obtained from Scopus and WoS. Moreover, the two databases we selected allowed us to ensure homogeneity in the queries and in the results and to reduce false positives.
We selected broad bibliographic databases in order to include conference proceedings, book chapters and journals. In particular, Scopus indexes the main conferences related to both Information Systems & Business Informatics (e.g., ECIS, ICEIS, HICSS, CBI, BIR) and Conceptual Modeling (e.g., ER and Models). Indeed, in the computer science field, conference papers are the "major source of publications" ([23], p. 247).

Search on databases and duplicate checking.
For the search on the bibliographic databases, we connected the keywords with Boolean operators in order to compose the following query, which is also detailed in Table 1:
TITLE ( ( "Enterprise monitoring" OR "performance monitoring" OR "performance measurement" OR indicator OR "KPI*" ) AND ( ontolog* OR semantic OR modeling OR model OR formal* ) ) AND TITLE-ABS-KEY ( enterprise OR "Supply chain" OR organization OR "collaborative network" OR "supply network" OR "alliance" OR "virtual enterprise" )
As can be seen from the query and the table, we selected only works strictly related to the business domain, excluding works related, for example, to the performance of software.
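The way the three keyword sets are combined can be sketched in a few lines of Python. This is only an illustration: the helper names or_group and build_query are hypothetical, and the sketch merely reassembles the query string reported above without calling any database API.

```python
# Hypothetical helpers; they only rebuild the query string given in the text.
INDICATOR_TERMS = ['"Enterprise monitoring"', '"performance monitoring"',
                   '"performance measurement"', 'indicator', '"KPI*"']
MODEL_TERMS = ['ontolog*', 'semantic', 'modeling', 'model', 'formal*']
DOMAIN_TERMS = ['enterprise', '"Supply chain"', 'organization',
                '"collaborative network"', '"supply network"', '"alliance"',
                '"virtual enterprise"']


def or_group(terms):
    """Join the terms of one keyword set with OR and wrap them in parentheses."""
    return "( " + " OR ".join(terms) + " )"


def build_query(indicator_terms, model_terms, domain_terms):
    """Combine the three keyword sets: the first two must match in the title,
    the third anywhere in title, abstract or keywords."""
    title_part = f"TITLE ( {or_group(indicator_terms)} AND {or_group(model_terms)} )"
    domain_part = f"TITLE-ABS-KEY {or_group(domain_terms)}"
    return f"{title_part} AND {domain_part}"


query = build_query(INDICATOR_TERMS, MODEL_TERMS, DOMAIN_TERMS)
print(query)
```

Keeping the three sets as separate lists makes the iterative refinement of the keywords a matter of editing one list and rebuilding the string.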
From this search, we retrieved 177 records on Scopus and 123 records on Web of Science. Both record lists were then imported into a joint JabRef [28] database, on which we performed a duplicate check. After eliminating the duplicates, we obtained a total of 214 records coming from both Scopus and Web of Science.
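The duplicate check can be illustrated with a minimal Python sketch. It is not the procedure implemented by JabRef: matching records on a normalized title is an assumption made here for illustration, the function names are hypothetical, and the records are toy data. The final lines only restate the arithmetic implied by the counts above.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace, so that
    trivially different spellings of the same title compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_records(*record_lists):
    """Merge several lists of records, keeping the first record found for
    each normalized title (title-only matching is an assumption)."""
    merged = {}
    for records in record_lists:
        for record in records:
            merged.setdefault(normalize_title(record["title"]), record)
    return list(merged.values())


# Toy data, not the actual search results.
scopus = [{"title": "A KPI Ontology for Enterprises", "source": "Scopus"},
          {"title": "Modeling Performance Indicators", "source": "Scopus"}]
wos = [{"title": "A KPI ontology for enterprises.", "source": "WoS"},
       {"title": "Semantic Performance Measurement", "source": "WoS"}]

unique = merge_records(scopus, wos)
print(len(unique))  # 3 unique records out of 4

# Counts reported in the text: 177 + 123 records retrieved, 214 kept.
duplicates_removed = 177 + 123 - 214
print(duplicates_removed)  # 86
```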

Selection of papers.
Starting from the dataset of 214 papers, we selected 50 of them based on their titles. In particular, this step was aimed at excluding those works in whose titles the keywords occurred but were matched in a way that was not relevant to the scope of the review (e.g., "model for improving performance").
On the selected 50 papers, we performed an abstract- and text-based selection, which resulted in 8 papers. For example, we excluded works focused more on enterprise architecture than on modeling itself and tools oriented to business process monitoring with few performance modeling aspects (e.g., [13]-[17]), as well as works on non-formal representations of KPIs (e.g., [18]-[22]).

Backward and forward references search.
In order to overcome the limitations of a keyword search performed on two databases, we proceeded with a backward and forward references search [29], [30]. In other words, we examined the bibliographies and the citations of the papers selected in the third phase, in order to include other relevant articles not returned by the databases. From this process, based on a systematic personal reading [23] of titles and abstracts, we retrieved 21 papers from the analysis of the bibliographies (i.e., backward references search) and 31 papers from the analysis of the citations (i.e., forward references search).
After checking for duplicates between the different datasets (i.e., the one resulting from the second phase, the one coming from the bibliographies and the one from the citations), we were left with a total of 30 papers. Based on the analysis of abstract and text, as previously described, we selected 17 relevant papers.

Processing of the dataset.
The total dataset analyzed, therefore, consists of 25 papers, 8 of which come from the keyword-based database search and 17 from the backward and forward references search. The papers were all imported into a unified JabRef database, which was later exported in XML in order to perform the bibliometric analysis (see Section 2.3).
For the purposes of this analysis, we adopted an approach focused on authors and on research groups [30]. As we will outline, this allowed us to have a broader perspective on the analyzed models.

Criteria for bibliometric analysis and classification
With the bibliometric analysis, we aim to answer research questions 1, 2 and 3, which address, respectively, the trend of the topic, the most prolific authors and the most cited works in the area. Therefore, the bibliometric analysis concerns several aspects, such as the number of papers published per year, the number of authors and authorships, and the number of citations per year. The analysis is described in detail in Section 3.
Moreover, all the selected works are classified according to four dimensions:
─ the modeling method, depending on the use of domain-specific modeling languages or ontologies;
─ the object of analysis, depending on the focus on KPIs or PPIs;
─ the extent of analysis, i.e., whether the model takes into account only indicators or also the organization in a broader sense, including the analysis of goals;
─ the level of granularity in the analysis of KPIs. In particular, we distinguish between works that focus on single enterprises and works that take into account an interorganizational setting.
The analysis of the relevant papers according to these four dimensions enables the categorization of the various models associated with performance measurement. The classification of these works makes it possible to associate each model with a modeling technique and with an object of analysis. It also allows models developed for similar purposes to be compared and tracked over time, and the progress associated with a particular object of analysis to be identified.
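The four dimensions can be read as a small data structure. The following Python sketch is only one possible encoding: the class and field names are hypothetical, and the works "W1" to "W3" are placeholders rather than papers from the dataset.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Method(Enum):
    DSML = "DSML"
    ONTOLOGY = "ontology"


class ObjectOfAnalysis(Enum):
    KPI = "KPI"
    PPI = "PPI"


class Granularity(Enum):
    SINGLE_ENTERPRISE = "single enterprise"
    INTERORGANIZATIONAL = "interorganizational"


@dataclass(frozen=True)
class Classification:
    work: str
    method: Method
    obj: ObjectOfAnalysis
    models_goals: bool          # extent of analysis: are goals modeled too?
    granularity: Granularity


def group_by_cell(classifications):
    """Group works by (method, object of analysis), one cell per pair."""
    cells = defaultdict(list)
    for c in classifications:
        cells[(c.method, c.obj)].append(c.work)
    return dict(cells)


# Placeholder works, used only to exercise the structure.
works = [
    Classification("W1", Method.DSML, ObjectOfAnalysis.KPI, True,
                   Granularity.SINGLE_ENTERPRISE),
    Classification("W2", Method.ONTOLOGY, ObjectOfAnalysis.PPI, False,
                   Granularity.SINGLE_ENTERPRISE),
    Classification("W3", Method.DSML, ObjectOfAnalysis.KPI, False,
                   Granularity.INTERORGANIZATIONAL),
]
cells = group_by_cell(works)
print(cells[(Method.DSML, ObjectOfAnalysis.KPI)])
```

Grouping by method and object of analysis mirrors the idea of comparing models developed for similar purposes.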
Furthermore, in order to classify ontologies, we adopted some additional criteria related to the peculiarities of ontologies, which we describe in the following.
Methodological approach. This criterion, inspired by [31], regards the language used for the development of the ontology and the process used to evaluate it. Indeed, it is difficult to distinguish between works designed with an inspiration, induction, deduction or synthesis approach, because the development process of the ontology is rarely made explicit. For this reason, we take into account the language used to develop the ontology and the validation approach used.
Re-use of existing ontologies. One of the main advantages of ontologies is the possibility to re-use existing knowledge in a specific domain [33]. In this sense, it is important to understand whether existing ontologies on KPIs re-use other ontologies in order to improve the model and, when this happens, how this is achieved [34].
Reasoning functionalities. Another peculiarity of ontologies is the reasoning support, which makes it possible to infer new knowledge from the original domain body. This can be particularly useful in the case of KPIs, since it can be used "to check inconsistencies among independent indicator definitions, reconcile indicators values coming from different sources, and provide the necessary flexibility to indicators management" [35].
Aim. This criterion regards the purpose for which the ontology has been developed and gains importance when an integration of different modeling approaches is needed. Indeed, the rationale underlying the conceptualization can have an impact on the definition of the concepts and of their relations, in particular in the case of applied ontologies.
Expressivity. This criterion, which we apply only to Description Logic (DL)-based ontologies, aims at evaluating the expressivity of the ontology. Each ontology has a different level of expressivity, depending on the DL family and the specific constructors (e.g., conjunction, disjunction, negation, value restriction, …) that are used in its development [36]. The concepts and roles used for the conceptualization of the domain "are both described with terminological descriptions, which are built from pre-existing terms and with a set of constructors […].The choice and combination of the different constructors permit designing different DL language" [37].
Finally, we analyze the cross-references among the examined papers in order to assess which works have been stimulated and inspired by which others.
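One simple way to organize such cross-references is a citation matrix restricted to the selected papers. The Python sketch below is an illustration under assumptions: the function names are hypothetical, the paper identifiers are placeholders, and counting how many selected papers cite each work is only one possible reading of the cross-reference analysis.

```python
def cross_citation_matrix(papers, citations):
    """Build a matrix m where m[i][j] == 1 if papers[i] cites papers[j].
    Citations to works outside the selected set are ignored."""
    index = {paper: i for i, paper in enumerate(papers)}
    matrix = [[0] * len(papers) for _ in papers]
    for citing, cited in citations:
        if citing in index and cited in index:
            matrix[index[citing]][index[cited]] = 1
    return matrix


def cited_by_count(matrix):
    """Number of selected papers citing each paper (column sums)."""
    return [sum(row[j] for row in matrix) for j in range(len(matrix))]


# Placeholder identifiers; "X" stands for a work outside the selected set.
papers = ["P1", "P2", "P3"]
citations = [("P2", "P1"), ("P3", "P1"), ("P3", "P2"), ("P3", "X")]

matrix = cross_citation_matrix(papers, citations)
print(matrix)                  # [[0, 0, 0], [1, 0, 0], [1, 1, 0]]
print(cited_by_count(matrix))  # [2, 1, 0]
```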

Bibliometric analysis
Many interesting insights can be gathered from the dataset on performance modeling. The first one concerns the relationship between papers and their authorships. Figure 1 summarizes the following elements: papers, authors and authorships (where each authorship accounts for a person who has authored a paper in that year).

Figure 1. Publications per year
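The three quantities summarized in Figure 1 can be computed as in the following Python sketch. The function name and the record layout (a year and a list of author names per paper) are assumptions, and the sample records are toy data rather than the analyzed dataset.

```python
from collections import defaultdict


def yearly_counts(papers):
    """Return {year: (papers, distinct authors, authorships)}.
    An authorship is one (author, paper) pair, so an author with two papers
    in the same year counts once as an author and twice as an authorship."""
    n_papers = defaultdict(int)
    authors = defaultdict(set)
    authorships = defaultdict(int)
    for paper in papers:
        year = paper["year"]
        n_papers[year] += 1
        authors[year].update(paper["authors"])
        authorships[year] += len(paper["authors"])
    return {y: (n_papers[y], len(authors[y]), authorships[y])
            for y in sorted(n_papers)}


# Toy records, not the actual dataset.
toy_papers = [
    {"year": 2012, "authors": ["A", "B"]},
    {"year": 2012, "authors": ["A", "C"]},
    {"year": 2013, "authors": ["A"]},
]
counts = yearly_counts(toy_papers)
print(counts)  # {2012: (2, 3, 4), 2013: (1, 1, 1)}
```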
In order to better understand the trends in the relevance of performance modeling, we analyze the number of citations per year, as depicted in Figure 2. For this purpose, we took into account the number of citations reported in Google Scholar, since Google Scholar takes into account a broader set of works when it comes to the citation count. In particular, we examined the number of citations obtained by the set of papers in each year, without excluding the cross-citations among the selected papers. If we jointly consider the number of published papers (Figure 1) and the number of citations per year (Figure 2), it emerges that, although the production in the area is constant, the topic is attracting growing interest from researchers working in other domains.
Moreover, we present the Core Authors (CAs) and the Current Authors (CuAs). The first category gathers those researchers who have published at least three papers in the last 10 years (i.e., in the period 2005-2014). The second category gathers those researchers who have published at least two papers in the last three years (i.e., in the period 2012-2014). Table 2 and Table 3 describe, respectively, CAs and CuAs in terms of the average number of papers per year and the average number of co-authors per year, computed over the respective time ranges (i.e., on a ten-year basis for CAs and on a three-year basis for CuAs). We consider both parameters important. The former allows us to understand how prolific these authors are in the specific field of analysis. The latter quantifies the authorship network of each CA and CuA (i.e., how large her/his collaboration network is).
Finally, as for the analysis of the most cited approaches, we look at cumulative citations. By cumulative citations, we mean that we did not look at each work as a stand-alone paper, since many of them are published by the same authors and represent evolutions of the same work. Also in this case, we did not exclude the cross-citations among the selected papers from the analysis. Table 4 shows the five most cited approaches.
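The selection of CAs and CuAs can be sketched in Python as follows. The thresholds and time windows are those stated above; everything else is an assumption of the sketch: the function name is hypothetical, the records are toy data, and the average number of co-authors per year is taken to be the number of distinct co-authors in the window divided by its length, which is only one possible interpretation.

```python
def authors_in_window(papers, first_year, last_year, min_papers):
    """Select authors with at least min_papers papers in [first_year, last_year]
    and compute their average papers and co-authors per year."""
    years = last_year - first_year + 1
    paper_count = {}
    coauthors = {}
    for paper in papers:
        if not first_year <= paper["year"] <= last_year:
            continue
        for author in paper["authors"]:
            paper_count[author] = paper_count.get(author, 0) + 1
            coauthors.setdefault(author, set()).update(
                a for a in paper["authors"] if a != author)
    return {
        author: {"papers_per_year": count / years,
                 "coauthors_per_year": len(coauthors[author]) / years}
        for author, count in paper_count.items() if count >= min_papers
    }


# Toy records, not the actual dataset.
toy = [
    {"year": 2003, "authors": ["A"]},        # outside both windows
    {"year": 2006, "authors": ["A", "B"]},
    {"year": 2010, "authors": ["A", "C"]},
    {"year": 2013, "authors": ["A", "B"]},
    {"year": 2014, "authors": ["B", "D"]},
]
core = authors_in_window(toy, 2005, 2014, min_papers=3)      # Core Authors
current = authors_in_window(toy, 2012, 2014, min_papers=2)   # Current Authors
print(sorted(core), sorted(current))  # ['A', 'B'] ['B']
```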

Performance Measurement Modeling
The most cited approaches all fall within the field of Domain-Specific Modeling Languages (DSML), which are used to model artifacts in a specific domain of analysis, such as the enterprise one. Some authors focused their attention on DSML for performance measurement. The aim is to offer models able to support the creation and the effective and efficient interpretation of "performance measurement systems […] by providing differentiated semantics of dedicated modeling concepts and corresponding descriptive graphical symbols" [1].

Pourshahid.
In [38], a framework for business process monitoring based on the User Requirements Notation (URN), extended with the Goal-oriented Requirements Language (GRL) and Use Case Maps (UCM), is proposed. In particular, the authors model the meta-type Indicator, which belongs to an IndicatorGroup through which it is possible to aggregate KPIs and to offer different views and perspectives to the user. Finally, each KPI has a target value (the users' expectations), a threshold value and a worst value.

Popova.
Similarly, Popova and Sharpanskykh [39] developed a framework for modeling KPIs and their relations through a dedicated modeling language based on first-order sorted predicate logic, while temporal relations are expressed through the Temporal Trace Language (TTL). In order to model indicators, they take into account different views of an enterprise, namely the process-oriented, performance-oriented, organization-oriented and agent-oriented views. In [39], the modeling language is represented graphically by means of circles (for KPIs) and directed edges (for relations). The authors take into account several types of relations among KPIs, such as causing, correlated and aggregationOf. As relations among KPIs, tasks, goals, processes, roles and agents, they analyze is_defined_over, is_based_on, uses, has_owner, measures and env_influence_on. The authors focus their study only on processes; however, they do not specify how KPIs are calculated and do not propose a specific DSML.

The Business Intelligence Model (BIM).
With similar intentions, in [40] a Business Intelligence Model (BIM) is used to model the strategy, the related goals, the indicators and the potential situations (Strengths, Weaknesses, Threats and Opportunities). In particular, the Semantics of Business Vocabulary and Business Rules (SBVR) proposal is used for modeling KPIs and their relations. The SBVR definition is then translated into OCL4OLAP, an extension of OCL, thus allowing models with a multidimensional structure to be queried. In [41], techniques and algorithms to define the expressions and values of KPI metrics are developed.

MetricM.
In [42] and [43], a method for the design and use of indicator systems is proposed. In more detail, the indicator system is modeled by means of a DSML named ScoreML. This method should satisfy some requirements, such as the possibility of designing comprehensive and consistent indicator systems, in order to support users in interpreting indicators and in understanding their rationale. Also, the method should support the representation of indicators at different levels of abstraction, depending on the user type, and enable the construction of software systems. These requirements can be satisfied through a DSML, which can be embedded in enterprise models.
In the model, each Indicator has some attributes (e.g., name, description, purpose) and can be computed from or be similar to other indicators. The users of the system can also define other cause-effect relations between indicators by means of the meta-type CustomizedRelationship. Also, each indicator can be linked with a Goal or a DecisionScenario. A SpecificIndicator, that is, an indicator type, can be associated with a ReferenceObject, which can be a BusinessProcess, a Resource or a Product.
In [1], a method for enabling reflective performance measurement, namely MetricM, and a domain-specific modeling language, named MetricML, are offered. After an analysis of the requirements that a method supporting the reflective design and use of performance indicators should satisfy, the key concepts of MetricML and its semantics are identified.
In this model, the Indicator concept describes the characteristics of a performance indicator. In particular, the meta-type is described by a formula, a UnitOfMeasurement, a TimeHorizon, a sourceOfRawData, a frequency of measurement, a purpose and an IndicatorCategory, which is useful for structuring large sets of indicators according to user-defined criteria. Furthermore, MetricM also takes into account inter-indicator relations, such as TransformsRelation and IndicatesRelation, indicator-context relations, through the integration of MetricML with the MEMO modeling languages, and the indicator-goal relation. Indeed, MetricML is part of a comprehensive research work on multi-perspective enterprise modeling (MEMO) [1], [2]. In the framework of a multi-perspective enterprise model, MML (Meta Model Language) has been defined, through which it is possible to model software engineering, social, managerial and economic aspects of the firm.

Wetzstein.
In [44], the authors define an ontological model for KPIs from the perspective of Business Process Management (BPM) and, more specifically, of Business Activity Monitoring (BAM), which enables the continuous monitoring of process performance. The monitoring is also enabled by Semantic Business Process Management (SBPM), which supports the semantic specification of business objects such as the inputs, outputs, preconditions and postconditions of activities. The same semantic specification can be used to model KPIs in the SBP modeling phase, as KPIs are also based on business objects.
The model for KPIs is defined by means of WSML, through which it is possible to express some basic mathematical operations and, thus, KPI formulas. In particular, each KPI has some attributes, such as name, description, targetValue and analysisPeriod. The KPI can deviate from the target value within a certain valueRange: if the deviation falls outside that range, an Alert is sent. Also, KPIs can have aggregated metrics or instance metrics. In turn, the latter can be:
─ StateMetric, in which case it evaluates a condition with a true or false statement;
─ DurationMetric, in which case it evaluates the time interval between two activities.
Moreover, instance metrics can be combined by means of operators (max, min, average and sum) into aggregate metrics, which have a composition expression.
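The structure described above can be made concrete with a short Python sketch. It is not the WSML model of [44]: the Python names are hypothetical, and treating valueRange as a tolerated absolute deviation from the target is an assumption made for illustration.

```python
from dataclasses import dataclass
from statistics import mean

# Operators named in the text for composing instance metrics.
AGGREGATORS = {"max": max, "min": min, "average": mean, "sum": sum}


@dataclass
class KPI:
    name: str
    description: str
    target_value: float
    value_range: float      # assumed: tolerated absolute deviation from target
    analysis_period: str


def state_metric(predicate, instance):
    """StateMetric: evaluate a condition on one process instance."""
    return bool(predicate(instance))


def duration_metric(start, end):
    """DurationMetric: time elapsed between two activities (same time unit)."""
    return end - start


def aggregate(operator, instance_values):
    """Compose instance metrics into an aggregated metric."""
    return AGGREGATORS[operator](instance_values)


def needs_alert(kpi, value):
    """True when the measured value deviates from the target by more than
    the tolerated range, i.e. when an Alert would be sent."""
    return abs(value - kpi.target_value) > kpi.value_range


# Toy example: average duration of three process instances, in hours.
kpi = KPI("avg_handling_time", "Average handling time", 4.0, 0.5, "monthly")
durations = [duration_metric(0, 4), duration_metric(2, 8), duration_metric(5, 10)]
average = aggregate("average", durations)
print(average, needs_alert(kpi, average))  # 5 True
```

Keeping the aggregation operators in a lookup table mirrors the idea of a composition expression that names the operator to apply to the instance metrics.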

Classification, Results and Discussion
As depicted in Figure 3, the works previously analyzed have been categorized according to the modeling method, the object and extent of analysis and the level of granularity. On the horizontal axis, works are subdivided according to the method used to model performance indicators: DSML or ontologies. The vertical axis shows the object of analysis, which can be the performance of processes (PPIs) or the overall performance of the organization (KPIs). Moreover, the shape of each marker represents the analysis of goals: works that take into account the organization's goals are represented with a square shape, while the others are shown as round shapes. Also, the color indicates the organizational granularity: models developed for single enterprises are shown as yellow shapes, while models developed for collaborative enterprises are shown as red shapes. The literature analysis and the classification show that there are still few works regarding performance modeling and no works at all that use ontologies to jointly analyze goals, KPIs and the collaboration among enterprises, which is gaining more and more importance. Indeed, on the one hand, the relevance of jointly accounting for and linking KPIs and goals is well known in the literature [45], [46]. On the other hand, as stated in [9], IS should "enable new forms of participation and collaboration, catalyze further the formation of networked enterprises and business ecosystems […] ushering in a new generation of enterprise systems". In this sense, works that consider these issues are particularly useful, since in order to develop IS it is important to perform a conceptual modeling phase concerning the domain of analysis [10]. Ontologies can be particularly useful to this aim, since complex information systems rely on robust, coherent and formal representations of their subject matter. In this sense, ontologies can provide models of different aspects of a business entity, contributing to intra- and inter-enterprise information systems. By committing to the same ontological specification, different applications share a common vocabulary with a formal language and clear semantics. Also, by representing knowledge with a well-established formalism [47], internal consistency and compliance checking can be performed in order to determine content adequacy.
Furthermore, Table 6 offers a specific classification of the works that propose ontologies for indicators. The table shows that most of the ontologies developed in this field do not enable reasoning functionalities or, at least, do not present this kind of result, even though this could be a useful functionality when it comes to interoperability among different organizations. Also, the expressivity is not reported, which can be a problem for readers interested in evaluating "what" the ontology does without having access to the whole ontology. In some cases, the authors do not even state the language in which the ontology is written, and in most cases the ontology is not available online. Moreover, existing ontologies are rarely re-used, except in the Enterprise Monitoring Ontology, which in any case refers only to ontologies not specifically related to indicators. In this sense, future research could compare in detail existing ontologies and domain-specific modeling languages in order to define a core ontology for indicators, which can then be enriched with domain ontologies. This could also be useful considering that some works could benefit from hints coming from similar projects. Figure 4 and Table 5 depict the cross-citations between the papers analyzed.

Conclusions
In this work, we present a systematic review and a bibliometric analysis of the literature with the aim of identifying, classifying, and summarizing existing research on performance modeling and of identifying areas and opportunities for future work. In order to do so, we classified all works according to criteria such as the modeling method, the object and the extent of analysis and the level of granularity. Moreover, we classified ontologies taking into account other criteria, such as the methodological approach, the re-use of existing ontologies, the reasoning functionalities, the aim and the expressivity.
The analysis of the retrieved works according to the criteria previously defined shows that few works address performance modeling with respect to the joint analysis of goals, KPIs and collaborative enterprises. In this field, ontologies can be particularly useful for modeling the domain of analysis, since they enable reasoning functionalities and guarantee a formal representation of the domain. In this sense, it would be useful to have more works that focus on what knowledge can be inferred from the domain, since only a few current works show these aspects. Also, a possible direction for future research is the comparison of existing ontologies and domain-specific modeling languages in order to define a core ontology for indicators, which can then be enriched with domain ontologies.
Future work will include the semantic, syntactic and structural comparison of the different models, in order to understand whether these models represent different perspectives and can be integrated.

Figure 2. Number of citations per year

Figure 3. Classification of the works on performance modeling

Figure 4. Citations between selected works

Table 1. Method adopted for literature review (* = wildcard matching all words that start with the preceding letters)

Table 2. Core Authors

Table 3. Current Authors

Table 4. Most cited modeling approaches

Table 6. Classification of ontologies