A corpus linguistic study of constructional equivalence for the Indonesian translation of ROB and STEAL based on the Open Subtitles Parallel Corpus

Catford's classic idea in translation theory indicates the measurability of translation equivalence. Following up on this idea, this paper offers a case study that measures the translation equivalence of the English verbal near-synonyms ROB and STEAL (R&S), especially equivalence at the constructional level. Adopting a quantitative corpus linguistic method and the Construction Grammar approach, we analyse random usage samples of R&S from an English-Indonesian parallel corpus for the degree of constructional equivalence along two dimensions: (i) the profiled participant roles and (ii) the grammatical construction types of these verbs. We find that the Indonesian translations maintain a high degree of equivalence along these dimensions, albeit with a few variations. This suggests that the translators attempt to be as faithful as possible to the source texts. Furthermore, our study reveals the translation norms/typicality in how the constructional profiles of the near-synonyms R&S are translated into Indonesian. More generally, the paper seeks to demonstrate how such a central notion as equivalence in translation studies can be investigated using parallel corpora and the quantitative corpus linguistic method.


Introduction
The recent explosion of digital data in all fields has positively impacted the broader field of linguistics, especially through the availability of large collections of digitised textual data, referred to as language corpora (singular: corpus). Nowadays, there are two major types of language corpora as far as the number of languages is concerned. The first type is monolingual corpora that contain texts from a single language (e.g., British English language corpora). The second one is bi-/multilingual, parallel corpora containing "original texts in language A vs. their translations in language B" (Mikhailov & Cooper, 2016, p. 5) (e.g., a corpus of an original novel and its translations in different languages).
The past two decades have seen a rapid growth of corpus-based translation studies as a branch of translation studies (Baker, 1993; Hu, 2016). Corpus-based translation studies share the same philosophy as corpus linguistics in that they embrace empiricism and a descriptive focus over the predominant paradigm of prescriptivism in translation studies. The prescriptive approach to translation "assumes the primacy of the source text and argues that the target text should seek to be as equivalent to the source text as possible" (Hu, 2016, p. 1; Zanettin, 2014, p. 180). In doing so, the prescriptive approach "relies heavily on intuition, anecdotal evidence, or a small number of samples" (Hu, 2016, p. 1). The corpus-based approach, in contrast, focuses on "describing the features of translation and translational norms" to reveal "the nature of translation and the interrelationship between translation and social culture, based on statistical analysis of a wealth of corpus data" (Hu, 2016, p. 1).
In this paper, we embrace the quantitative, empirical foundation of corpus linguistics to investigate translation equivalence and norm (§1.1 below) in samples of Indonesian translated texts from English movie and TV subtitles (§2.1). As a case study, we focus on a pair of near-synonymous dispossession verbs in English, namely ROB and STEAL (R&S), and the Indonesian equivalence of the verbs' constructional properties (i.e., profiled participant roles and construction types; see §1.2 for further details on these points).

A brief overview on the translation equivalence and norm
Equivalence is a fundamental notion in translation theories (Baker, 1993; Kenny, 2001). From the perspective of the equivalence-based theory of translation, equivalence is understood as "the relationship between a source text (ST) and a target text (TT) that allows the TT to be considered as a translation of the ST in the first place" (Kenny, 2001, p. 77). In this paper, following Toury (1980, p. 39, as cited in Kenny, 2001, p. 79), we view equivalence as an empirical category that could be established after the translation: "actual relationships between actual utterances in two languages (and literatures), recognised as TTs and STs which are subject to direct observation" (Kenny, 2001, p. 79).
With this view of equivalence as an empirical category and the use of a parallel translation corpus, we can measure and potentially establish equivalence after the fact based on the evidence available in the corpus. Furthermore, Toury (as cited in Kenny, 2001) notes that the study of equivalence should not focus on (i) whether the two texts are equivalent or not, which is essentially a prescriptive view (Hu, 2016, pp. 5, 140), but on (ii) the kind and degree of equivalence. The parallel corpus allows us to inspect not just one or two examples, but a set of samples of utterances that can be analysed both qualitatively and quantitatively, allowing us to measure the degree of equivalence between the utterances in ST and TT.
The view that equivalence could be measured in probabilistic terms goes back to Catford's (1965) classic idea. The probability of using a given item (in comparison to the other items) as the translation of X can be recast as a translation rule or "norm" (Toury, 2000). Following Baker (1993, p. 239), we view translation norm as "typicality" emerging from works in corpus-based lexicography. Translation norms and equivalence can be identified by inspecting not just an individual translation of a linguistic unit, but a set of corpora of the source and the translated texts. Investigating a set of samples enables us to record and quantify which pattern of translation is opted for more frequently (i.e., used repeatedly) in preference to the other patterns for a given unit of analysis in each culture (Baker, 1993, p. 240).
Furthermore, statistical methods applied to large corpora underlie the development of data-driven machine translation (Kenny, 2020). The quantitative information concerning the constructional patterns (e.g., argument structures) of a verb derived from a parallel corpus can be used as input datasets for building translation models with the machine learning techniques implemented in statistical and neural-network machine translation (Kenny, 2020, pp. 307-308). The "machine" (i.e., computational method) learns the translation model by means of the distributional patterns/tendencies (e.g., argument structure, construction types) in which a given verb appears in the corpus. This approach is in line with the usage-based approach to meaning (originally promoted in Wittgenstein, 1953; and Firth, 1957), which views the meaning of a linguistic form as constructed from its morphosyntactic, semantic, and lexical co-occurrence properties (see Stefanowitsch, 2010, pp. 368-370, for an overview). This idea of meaning as co-occurrence networks of linguistic forms is adopted in Catford's proposal for the theory of meaning in translation called "the total network of relations entered into by any linguistic form" (Catford, 1965, p. 35). One example of the "relations" of a linguistic form relevant in this study is the "formal relations" of the form, such as the grammatical construction of the verb.
To summarise, corpus-based, quantitative data derived from a sample of translated texts in a parallel corpus allows us to measure such central notions as equivalence and norms in translation studies. This paper is an attempt to provide a modest example of the study of equivalence and norms at the level of grammatical, syntactico-semantic construction of verbal synonyms.
Construction Grammar (CxG) theory (Goldberg, 1995, 2013; Hilpert, 2020) has incorporated a central apparatus of Frame Semantics theory (Fillmore, 2014) to provide a unified model for capturing the meaning of verbs; this apparatus is the concept of the semantic frame. A semantic frame is a rich, encyclopaedic knowledge of a "situation type describable in terms of the kinds of relations, situations or sub-events 'evoked' in the minds of those who know the language" when they use or hear lexical items belonging to a given frame (Fillmore, 2014, p. 126). In Frame Semantics, a semantic frame consists of so-called Frame Elements, which represent semantic participant roles in the situation related to the frame (Fillmore, 2014, p. 123; Thorgren, 2005, pp. 3-4). From the perspective of CxG, therefore, the meaning of verbs is defined relative to the evoked semantic frames and frame elements.
In the case of R&S, following what is presented in the FrameNet repository, ROB evokes the so-called Robbery frame while STEAL evokes the Theft frame (but see Thorgren, 2005, pp. 6-7, who groups R&S under the broader domain of POSSESSION since the two verbs evoke a situation of taking someone's possession without permission). Even though R&S belong to different semantic frames, their meanings involve the same configuration of participant roles: THIEF (or PERPETRATOR), GOODS, and TARGET (i.e., VICTIM or locational SOURCE) (Goldberg, 1995; Stefanowitsch, 2011). Goldberg (1995, pp. 44-45) further notes that R&S differ in terms of which of these participant roles are lexically profiled (i.e., obligatorily expressed) in utterances with R&S. ROB profiles the TARGET and syntactically realises this role as the direct object, while STEAL profiles the GOODS (Goldberg, 1995, pp. 45-46; Dux, 2011, p. 22; Stefanowitsch, 2011, p. 263). A further qualitative difference proposed between R&S is that ROB exerts a high degree of negative affectedness on the TARGET/VICTIM while STEAL does not (Goldberg, 1995, p. 46). This paper will further assess the proposed difference in the profiling of R&S using a different genre of data from the previous studies (see §3.2). These constructional profiles of R&S will then be taken as the linguistic domain for which the verbs' equivalence will be measured in the Indonesian translations (§3.2.1 and §3.3), following Toury's suggestion to focus on measuring the degree of equivalence.

The corpus data
The data for this study come from the OpenSubtitles v2018 (hereafter OSub) parallel corpora (Lison & Tiedemann, 2016), specifically the English-Indonesian sub-corpus. The OSub parallel corpora are built from a large, open-source database of movie and TV subtitles. The latest version of OSub (v2018) contains over two billion sentences across sixty-two languages. The English-Indonesian sub-corpus used in this study consists of 9.7 million aligned sentences between English (as the source texts) and their Indonesian translations (as the target texts). In total, these sentences contain 72.8 million word-tokens for English and 60.9 million word-tokens for Indonesian. We downloaded the English-Indonesian corpus file in .tmx format (1.31 GB), the content of which is shown in Figure 1. Figure 1 shows that (i) each English sentence appears immediately above its Indonesian translation and (ii) each sentence is tagged with a language identifier (i.e., <tuv xml:lang="en"> for the English sentences and <tuv xml:lang="id"> for the Indonesian translations). We wrote an R script (R Core Team, 2020; see Rajeg et al., 2021b, for the R script) to separate the English and Indonesian sentences in the .tmx file into two plain text files (.txt). These plain text files then became the input data for generating a parallel concordance (i.e., parallel Keyword-in-Context display) (see Figure 4). The R script maintains the alignment between the source English sentences and their target Indonesian translations in the two files (see Figure 2). One can see from Figure 2 that, for example, English sentence number 3097 in the eng_OpenSub.txt file (i.e., I am going to steal the guy) corresponds to its Indonesian translation in sentence number 3097 in the id_OpenSub.txt file (i.e., aku akan mencuri pria itu). This is what is meant by sentence-aligned parallel corpora.
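The separation step described above can be sketched in Python as follows. This is only an illustration of the idea, not the authors' R script: the function name split_tmx and the in-memory sample are assumptions, and the sketch presumes that each translation unit (<tu>) holds one <tuv> element per language with the sentence inside a <seg> element, as is standard in the TMX format.

```python
import io
import xml.etree.ElementTree as ET

# Fully qualified name under which ElementTree reports the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def split_tmx(source, src_lang="en", tgt_lang="id"):
    """Return two aligned lists of sentences read from a TMX source.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper; not part of the authors' published code.)
    """
    src_sents, tgt_sents = [], []
    # iterparse streams the input, so a large corpus file is never
    # loaded into memory as a whole.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        segs = {}
        for tuv in elem.findall("tuv"):
            seg = tuv.find("seg")
            text = "".join(seg.itertext()) if seg is not None else ""
            # Collapse internal whitespace so each sentence fits on one line.
            segs[tuv.get(XML_LANG)] = " ".join(text.split())
        # Keep a unit only when both languages are present, so that item i
        # in one list always corresponds to item i in the other.
        if src_lang in segs and tgt_lang in segs:
            src_sents.append(segs[src_lang])
            tgt_sents.append(segs[tgt_lang])
        elem.clear()  # release the memory held by the processed unit
    return src_sents, tgt_sents


# Minimal in-memory sample built around the sentence pair quoted in the text.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>I am going to steal the guy</seg></tuv>
<tuv xml:lang="id"><seg>aku akan mencuri pria itu</seg></tuv></tu>
</body></tmx>"""

english, indonesian = split_tmx(io.BytesIO(SAMPLE))
print(english, indonesian)
```

Writing each list to its own text file, one sentence per line, would then give two files in which identical line numbers identify a sentence and its translation.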
From these two files, we retrieved sample sentences for various inflectional forms of ROB and STEAL (R&S) and their corresponding Indonesian translations.

Data retrieval
The data for this study represent the English sentences (and their Indonesian translations) containing the various inflectional verbal forms of the lemmas (i.e., abstract forms) ROB and STEAL (Figure 3). The verb forms of ROB occur 1,870 times, which is highly significantly fewer than the 10,715 occurrences of the verb forms of STEAL (χ² goodness of fit = 6216.5; df = 1; p < 0.001). The ratio of ROB to STEAL is approximately 1:6.
Figure 3 Token frequencies of the inflectional forms of ROB and STEAL in the English-Indonesian OSub sub-corpus.
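The goodness-of-fit figure can be checked with a few lines of Python. The sketch below assumes that the test compares the two token counts against equal expected frequencies, which the text does not state explicitly but which reproduces the reported statistic. The function names are illustrative, and the p-value formula holds only for one degree of freedom.

```python
import math


def chi_square_gof_equal(observed):
    """Chi-square goodness-of-fit statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return stat, len(observed) - 1


def chi_square_p_df1(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(stat / 2))


rob_tokens, steal_tokens = 1870, 10715  # token counts reported in the text
stat, df = chi_square_gof_equal([rob_tokens, steal_tokens])
p_value = chi_square_p_df1(stat)
ratio = steal_tokens / rob_tokens

print(f"chi-square = {stat:.2f}, df = {df}, p < 0.001: {p_value < 0.001}")
print(f"STEAL occurs {ratio:.2f} times as often as ROB")
```

The statistic comes out at about 6216.45, in line with the reported value, and the ratio of about 5.73 corresponds to the rounded 1:6.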
We analysed two sets of data: (i) one is a random sample of 150 sentences with forms referring to the lemma ROB and the corresponding 150 sentences of their Indonesian translations (300 sentences in total for ROB), and (ii) another random sample of 150 sentences with forms referring to the lemma STEAL and the corresponding 150 sentences of the Indonesian translations (300 sentences in total for STEAL). In total, we analysed 600 sentences from the two languages.
The concordance (Keyword-in-Context display) technique in corpus linguistics is used to bring the data into a more accessible, tabular format. We generated parallel concordances for the R&S samples whereby their forms become the central keywords surrounded by their preceding (left) and following (right) contexts; the Indonesian translations are represented in a separate column (see Figure 4). We used the para_conc() function from the paracorp R package (Rajeg, 2021a), which is designed to create parallel concordances from parallel corpus inputs. The output of para_conc() is a tab-separated plain text file that can be imported into spreadsheet software.
Figure 4 Snippet of the parallel concordance for ROB generated by the para_conc() function in the paracorp R package (Rajeg, 2021a) and viewed in the LibreOffice Spreadsheet.
The parallel concordance is not the result of the data analysis. It is only one facet of the data collection procedure. The concordance organises the data in a format that facilitates qualitative and quantitative analyses in relation to the research problems, which are the topic of the following sub-sections.
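To make the layout of a parallel concordance concrete, the following Python sketch builds keyword-in-context rows from two aligned sentence lists. It does not reproduce the para_conc() function or its arguments. The function names, the column names, and the regular expression for the forms of STEAL are all assumptions introduced for illustration.

```python
import re

# Hypothetical pattern for the inflectional forms of STEAL
# (steal, steals, stealing, stole, stolen); not taken from the authors' script.
STEAL_FORMS = r"\bst(?:eal(?:s|ing)?|ole|olen)\b"

COLUMNS = ["sent_id", "left", "node", "right", "translation"]


def parallel_concordance(src_sents, tgt_sents, pattern):
    """Return one KWIC row per match of `pattern` in the source sentences."""
    regex = re.compile(pattern, flags=re.IGNORECASE)
    rows = []
    for sent_id, (src, tgt) in enumerate(zip(src_sents, tgt_sents), start=1):
        for match in regex.finditer(src):
            rows.append({
                "sent_id": sent_id,
                "left": src[:match.start()].strip(),   # preceding context
                "node": match.group(0),                # the keyword itself
                "right": src[match.end():].strip(),    # following context
                "translation": tgt,                    # aligned target sentence
            })
    return rows


def to_tsv(rows):
    """Serialise concordance rows as tab-separated text with a header line."""
    lines = ["\t".join(COLUMNS)]
    for row in rows:
        lines.append("\t".join(str(row[col]) for col in COLUMNS))
    return "\n".join(lines)


# The sentence pair quoted in the corpus description above.
english = ["I am going to steal the guy"]
indonesian = ["aku akan mencuri pria itu"]

rows = parallel_concordance(english, indonesian, STEAL_FORMS)
print(to_tsv(rows))
```

Each row keeps the keyword in its own column with the left and right contexts beside it, so the table can be sorted or filtered by keyword form before the manual annotation begins.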

Aspects of data analysis
In this sub-section, we outline the qualitative and quantitative aspects of the data analyses on the parallel concordances for R&S and their Indonesian translations.

Annotated qualitative variables
The qualitative aspect of the analyses includes manual annotations of the syntactic, semantic, and lexical features for each use of R&S in the concordance samples (see Table 1 for the summary). At the lexical level, we identified the Indonesian words and word classes used as the translation of R&S in each sample sentence; this is not discussed in more detail here ( §3.1) but in another publication (Rajeg et al., 2021a). Moreover, we manually extracted the syntactic collocates of R&S and their translations. These collocates represent lexical items filling in the participant roles of R&S in the constructions; these roles are THIEF, GOODS, and TARGET; the TARGET role can be HUMAN/VICTIM or LOCATION (Goldberg, 1995, p. 46;Stefanowitsch, 2011, p. 260;Dux, 2011, pp. 18, 22).
At the constructional level, we focused on two main analytical aspects. The first one is the overt expression of the three participant roles of R&S (and their translations) in the samples ( §3.2). If each participant role was linguistically encoded (i.e., explicitly mentioned) and profiled (i.e., mapped onto the core syntactic arguments) in the sentence, it was coded as TRUE, otherwise, it was coded as FALSE (Stefanowitsch, 2011, pp. 262-263). This analysis relates to the research problem concerning the profiled participant roles of R&S and the degree to which such profiling is maintained in the Indonesian translations (see §3.2.1).
The second aspect of the constructional analysis is the type of constructions of R&S (and their Indonesian translations) ( §3.3). The construction type, in this case, captures the mapping of the participant roles, especially GOODS and TARGET, onto the core arguments of the two verbs, especially the direct object (in an active sentence) and subject (in a passive sentence) arguments. The focus on these two participant roles and their mappings onto the two arguments in two different grammatical voices is motivated by previous works on R&S arguing that the two verbs differ regarding which participant (esp. GOODS and TARGET) is profiled (i.e., mapped onto the direct object argument) and which one can be unexpressed explicitly in syntax (Goldberg, 1995;Stefanowitsch, 2011;Dux, 2011;Glynn, 2004). Consider the following attested examples of R&S in active sentences found in British English online writings (e.g., mainly online news and blogs).
In (1)a, the TARGET role of ROB is profiled as it is mapped onto the direct object (a core argument), while the GOODS is expressed in an adjunct prepositional phrase and can be left out completely, as shown in (1)b. In (2), STEAL profiles the GOODS as it is mapped onto the direct object ((2)a), while the TARGET can be optional ((2)b). Example (3) provides a clear contrast between ROB and STEAL in terms of which participant role they tend to profile, given that these verbs co-occur in the same sentence: ROB profiles the locational TARGET (Robert Burn's grave), with the human TARGET expressed in the possessive/genitive 's modifier (i.e., Robert Burn's), while STEAL profiles the GOODS (skull) associated with the locational TARGET. In the database, we labelled these construction types as "TARGET-OBJECT" and "GOODS-OBJECT" (for the active voice usage of R&S), and "TARGET-SUBJ-PASS" and "GOODS-SUBJ-PASS" (for the passive voice usage of R&S). We are interested in measuring the degree to which these constructions for the TARGET and GOODS are preserved in the Indonesian translations. In a way, this aspect resembles what Baker (2017) calls "equivalence above the word level".
The final level of analysis is semantic and relates to the two roles of interest (GOODS and TARGET), namely (i) the animacy of the TARGET, and (ii) the semantic types of the GOODS and TARGET roles. In terms of the animacy of the TARGET, we determined whether the TARGET is (i) animate and/or sentient, hence showing a more specific role of VICTIM, or (ii) a location, thus LOCATION TARGET. The motivation behind this analysis is Goldberg's (1995, pp. 46-47) proposal that ROB semantically exerts a higher degree of negative affectedness on the TARGET, suggesting a higher likelihood of an animate and/or sentient TARGET. As for the coding of semantic types, we adopted the categorisation proposed by Fernández-Martínez and Faber (2020). This semantic annotation is motivated by the hypothesis that the GOODS of ROB tend to be of high value and precious, which is not always the case for the GOODS of STEAL (Dux, 2018, p. 325). The semantic analyses will not be discussed in this paper but are part of an upcoming one related to this project (see §4 below). In the parallel concordance table, each of these variables was annotated in a separate column. The fully annotated database then became the input for the statistical analyses discussed next.
As mentioned in §1, this paper measures the degree of constructional equivalence of R&S in Indonesian, involving equivalence in the profiled participant roles (§3.2 and §3.2.1) and the construction types (§3.3). To measure the degree of equivalence for the profiled participant roles (variables 9-14), for instance, we cross-tabulate how many times the profiling of a role (e.g., THIEF) in English is maintained in its Indonesian translation (i.e., ENG_THIEF: TRUE and IDN_THIEF: TRUE) vs. how many times the role is de-profiled in the translation (i.e., ENG_THIEF: TRUE and IDN_THIEF: FALSE), and vice versa (e.g., a role de-profiled in English but profiled in Indonesian). These distributions were evaluated for statistical significance (using the Chi-Square Test for Independence or the Fisher-Yates Exact (FYE) test) and for effect size (Cramer's V or the Phi coefficient).
For the second aspect, namely measuring the equivalence at the construction-type level (see variables 15-18), the same approach is adopted as for the profiled participant roles. For instance, given the use of the TARGET-OBJECT construction for ROB in English (cf. example (1)a), to what extent would this same construction be used in the Indonesian translation compared to the other construction types? To answer this question, we run a series of FYE tests à la "Collostructional Analysis" (Stefanowitsch, 2013; Stefanowitsch & Gries, 2003) for each construction-type co-occurrence in English and Indonesian to determine which construction-type co-occurrence is strongly preferred between the source English text and the Indonesian translations of R&S.
The complete annotated databases, and the R code for the statistical analyses and visualisations, are available open access (see Rajeg, 2021b, for the download URL). The following R packages are also used for the study: tidyverse, vcd (Meyer et al., 2006), readxl, and RColorBrewer (Neuwirth, 2014).
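The cross-tabulation and the Phi coefficient for one participant role can be sketched in Python as below. The TRUE/FALSE codes in the example are invented toy data rather than the study's annotations, and the function names are illustrative only.

```python
import math
from collections import Counter


def cross_tabulate(eng_flags, idn_flags):
    """Count how often each (English, Indonesian) profiling pair occurs."""
    return Counter(zip(eng_flags, idn_flags))


def phi_coefficient(table):
    """Phi coefficient of a 2x2 table keyed by (eng_profiled, idn_profiled)."""
    a = table[(True, True)]    # profiled in both languages
    b = table[(True, False)]   # profiled in English, de-profiled in Indonesian
    c = table[(False, True)]   # de-profiled in English, profiled in Indonesian
    d = table[(False, False)]  # de-profiled in both languages
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else float("nan")


# Toy annotation columns (hypothetical): 12 sentence pairs coded for one role.
eng_thief = [True] * 8 + [False] * 4
idn_thief = [True] * 7 + [False] + [True] + [False] * 3

table = cross_tabulate(eng_thief, idn_thief)
phi = phi_coefficient(table)
print(dict(table))
print(f"phi = {phi:.3f}")
```

A Phi value close to 1 means that the translations almost always keep the source profiling, whereas a value near 0 means that profiling in the translation is unrelated to profiling in the source.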

Profiled participant roles of ROB and STEAL
We adapted Stefanowitsch's (2011, p. 262) corpus-based approach to investigate the profiled participant roles of R&S. If a given role is explicitly mentioned in the sentence and mapped onto the core syntactic argument of the verbs (i.e., the roles are lexically profiled in the sense of Goldberg, 1995, pp. 44-45), it was coded as TRUE; otherwise, it was coded as FALSE. In (4), the THIEF (I) and the locational TARGET (banks) are explicitly mentioned and profiled; hence the THIEF and TARGET were coded as TRUE while the GOODS received FALSE coding. In contrast, example (5) does not explicitly mention the THIEF and the GOODS, but only the locational TARGET (grocery store), which is also profiled at the same time as the direct object of ROB. Example (6) also omits the THIEF but mentions the GOODS (dollars) in a non-core argument slot (i.e., prepositional phrase) and the locational TARGET (store).
We calculated the frequency of explicit mention for each role in the core syntactic arguments. Which participant role is strongly profiled by R&S? Is there an association between a certain role and one of the two verbs, as suggested in the previous studies (Goldberg, 1995; Stefanowitsch, 2011)? Figure 5 summarises the results of this analysis.
The results shown in Figure 5 are in line with Goldberg's (1995) introspective proposal and Stefanowitsch's (2011) preliminary corpus-based findings on the profiled participant roles for R&S in English. The Chi-Squared Test for Independence indicates that there is a highly significant asymmetry (χ² = 222.09; df = 2; p (two-tailed) < 0.001) with a strong/robust effect size (Cramer's V = 0.73) in the profiling of the participant roles of R&S. The specific effects are shown by the stark differences in the profiling of the TARGET (profiled by ROB) and the GOODS (profiled by STEAL); the THIEF, meanwhile, is nearly always profiled by R&S. The association plot in Figure 6 visualises more intuitively the direction and strength of the association between the participant roles and R&S.
The rising, bluish rectangles indicate positive associations while the falling, reddish rectangles indicate negative associations; the darker the shading, the stronger the effect. The prominent effects are shown by the strong preferences of ROB to profile the TARGET, and of STEAL to profile the GOODS; these provide further empirical evidence, based on a movie/TV subtitles corpus, for the distinct profiled participant roles of R&S as proposed by Goldberg (1995) and Stefanowitsch (2011). The following sub-sections discuss the degree of equivalence for the profiled participant roles and their Indonesian translations.

Degree of equivalence for the profiled participant roles of ROB and STEAL
In §3.2 above we have presented the degree to which a participant role in the R&S semantic frame is profiled, and found strong, positive associations between the TARGET and ROB, and between the GOODS and STEAL (Figure 6). The THIEF role does not show any strong preference as it is profiled nearly equally frequently by R&S. In this sub-section, we present the degree of equivalence for the profiled participant roles of R&S in their Indonesian translations. A participant role is considered equivalent in its profiling if that role is explicitly mentioned in the core argument functions in the English source text as well as in the Indonesian translation. The greater the frequency of the profiling of a participant role in both English and its corresponding Indonesian translation, the greater the constructional equivalence of that participant role. Such a high degree of equivalence may also suggest a low degree of translation shift and loss of information at the level of the participant roles of the translated verbs. To illustrate this idea, consider the following examples. Example (7) represents a constructional equivalence for the participant roles of STEAL. This is because all core argument participant roles of STEAL (the THIEF and GOODS) are preserved and profiled in the Indonesian translation, even though in the Indonesian translation the GOODS is realised as an encliticised core argument with the third person suffix -nya. Moreover, the TARGET role of STEAL, expressed in an oblique prepositional phrase in (7), is also mapped onto the same function in the Indonesian translation. Semantically, the use of STEAL in (7) is metaphorically interpreted as an abduction and thus rendered as menculik 'abduct' in Indonesian, given that the TARGET is understood as human.
Example (8), in contrast, represents a loss of participant role information. The TARGET, which is explicitly mentioned as the direct object of ROB in the English source text, is absent in the Indonesian translation. With respect to (8), note that RAMPOK, which is the norm for the Indonesian lexical equivalence for ROB (Rajeg et al., 2021a), occurs in a different construction pattern whereby it is the GOODS, rather than the TARGET, that is profiled onto the direct object core argument, which is the constructional profile associated with STEAL (see §3.3 for the evidence).

Degree of equivalence for the THIEF role
First, we look at the degree of equivalence for the profiling of the THIEF role, shown in Figure 7; the left panel shows the data for ROB and the right one for STEAL. Overall, R&S demonstrate a high degree of equivalence in the profiling and de-profiling of the THIEF role. For ROB, in 98.8% of the 85 cases where the THIEF is profiled in the English source text, it is also profiled in the Indonesian translation (cf. examples (7) and (8)). We can also look at this issue from a different perspective, namely, when the THIEF is not profiled. When the THIEF is de-profiled in the English source text, it is also de-profiled in the Indonesian translations in 94.6% of the 37 cases (see (9)). These correlations are highly significant (p (Fisher-Yates Exact) < 0.001) and represent robust effects (ϕ = 0.94). The translation of ROB in (9) is a less prominent verb form other than RAMPOK, namely copet 'to pickpocket'. It might be opted for when the robbery happens while the TARGET, especially the VICTIM, is on the street rather than inside a static location. In Indonesian, the event of copet happens on the street.
Next, we also identify a high degree of equivalence for the translation of the THIEF role of STEAL. The profiling and de-profiling of this role are highly significantly (p (Fisher-Yates Exact) < 0.001) and robustly (ϕ = 0.81) maintained in more than 90% of all the cases in the translations. The results for THIEF with R&S may suggest that the Indonesian translations attempt to be as faithful as possible to whether the THIEF role is (de-)profiled in the English source texts.
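The test behind these figures can be illustrated with a small, self-contained Python sketch of a two-sided Fisher-Yates Exact test. The cell counts are an assumption: they are reconstructed from the percentages reported for ROB above (84 of 85 profiled cases maintained, i.e. 98.8%, and 35 of 37 de-profiled cases maintained, i.e. 94.6%) and are not taken from the authors' dataset. The two-sided rule used here sums the probabilities of all tables that are no more likely than the observed one.

```python
import math
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher-Yates Exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at most as probable as the observed one
    # (with a small tolerance for floating-point ties).
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_obs * (1 + 1e-7))


# Reconstructed counts for the THIEF role of ROB (assumption, see lead-in):
# rows = English profiled / de-profiled, columns = Indonesian profiled / de-profiled.
a, b = 84, 1
c, d = 2, 35

p_value = fisher_exact_two_sided(a, b, c, d)
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

print(f"p < 0.001: {p_value < 0.001}")
print(f"phi = {phi:.2f}")
```

With these reconstructed counts the Phi coefficient rounds to 0.94 and the p-value falls well below 0.001, which agrees with the values reported for ROB.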

Degree of equivalence for the GOODS role
The contrasting distribution in the (de-)profiling of the GOODS in Figure 8 below may not be surprising given the patterns shown in Figure 5 and Figure 6. That is, the GOODS role is strongly profiled by STEAL rather than by ROB. Here, we look at the extent to which these constructional profiling patterns of the GOODS are rendered in the Indonesian translations. Of the 108 cases where the GOODS of ROB is not profiled in the English source text, the Indonesian translations preserve this de-profiling in 98.1% of the cases (see the right maroon bar in the ROB panel) rather than explicitly profiling the GOODS role (1.9%) in the direct object slot (see example (10) below). In example (10), the GOODS (i.e., uang 'money') is explicitly added and profiled in the direct object slot in the Indonesian translation, whereas it is absent in the English source text (but may already be understood from the broader context and influenced by the constructional pattern of ROB, which rarely profiles the GOODS; see Figure 5). On the very few occasions when the GOODS is overtly encoded and profiled by ROB (N = 2), the Indonesian translations follow this in 100% of the cases. Overall, we identify a highly significant (p (Fisher-Yates Exact) < 0.001) and robust (ϕ = 0.7) correlation between the (non-)profiling of the GOODS in the English source text and in the Indonesian translations, suggesting a high degree of equivalence at the constructional, non-profiled participant level of analysis.
For STEAL, we also identify a significant (p (Fisher-Yates Exact) < 0.05) and robust (ϕ = 0.57) pattern of correspondence in the profiling of the GOODS in the source text and in the Indonesian translations. In 98.2% of the 110 cases where the GOODS is profiled in the English source text, it is also profiled in the Indonesian translations. As in the results for THIEF in §3.2.1.1 above, the Indonesian translators seem to be faithful to the English source text in terms of the (de-)profiling of the GOODS role.

Degree of equivalence for the TARGET role
Lastly, we discuss the degree of equivalence for the TARGET role. As with the THIEF and GOODS roles, we found a high degree of equivalence in the (de-)profiling of the TARGET role in the English source text and in the Indonesian translations. [...] (1) and (3)), reflecting the profiled participant role of ROB (Figure 5). The Indonesian translations significantly maintain this profiling of the TARGET in 91.7% of the cases (p (binomial) < 0.001), suggesting a high degree of constructional equivalence and an attempt to be as faithful as possible to the source text. Furthermore, a significant (p (Fisher-Yates Exact) < 0.05) and perfect correlation (ϕ = 1) is shown in the (de-)profiling of the TARGET of STEAL. Given these results, the Indonesian translations of the TARGET of STEAL demonstrate a high degree of equivalence.

Construction types of ROB and STEAL and their degree of equivalence in Indonesian
This sub-section discusses the degree of equivalence at the construction-type level (see variables 15-18 in Table 1). We focus on the construction types related to the TARGET and GOODS roles that exhibit strong and highly significant interactions with R&S in the English source texts. For instance, given the use of the TARGET-OBJECT construction for ROB in English, to what extent would this same construction be used in the Indonesian translation compared to the other construction types? To answer this question, we run a series of Fisher-Yates Exact tests à la "Collostructional Analysis" (Stefanowitsch, 2013; Stefanowitsch & Gries, 2003) for each construction-type co-occurrence in English and Indonesian. As a result, we can determine which construction-type co-occurrence is strongly maintained between the English source text and the Indonesian translations of R&S (cf. Janda & Lyashevskaya, 2013, for a similar approach to a different object of study). Following the standard procedure for multiple significance testing (McDonald, 2014, p. 257), we adjusted the significance level using Holm's method (Gries, 2009, pp. 242-243, 249) and will discuss the construction-type co-occurrences (between the English source and the Indonesian translations) that are significant at the adjusted level of p (Holm) < 0.001.
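Holm's adjustment can be sketched in Python as below. The p-values in the example are invented for illustration and are not results from this study. The function name is an assumption, and the procedure is the standard step-down method: the smallest p-value is multiplied by the number of tests, the next smallest by one fewer, and so on, with the adjusted values kept monotonically non-decreasing and capped at 1.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so a larger raw p-value never ends up with a
        # smaller adjusted value than a smaller raw p-value.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three construction-type co-occurrence tests.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
for p, p_adj in zip(raw, adjusted):
    print(f"raw = {p:.2f}  Holm-adjusted = {p_adj:.2f}")
```

A co-occurrence then counts as significant only if its adjusted value stays below the chosen threshold, which guards against chance findings when many tests are run on the same table.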

Constructional equivalence for ROB
In §3.2, we showed that ROB and STEAL differ with respect to which non-THIEF roles are profiled (i.e., mapped onto the core syntactic arguments). We now specify the construction type for the profiling of these non-THIEF roles and the extent to which a given construction type is rendered into the same construction type in the Indonesian translations. Consider the data for ROB in Figure 10. The construction types for the verbal use of ROB in the sample are shown on the horizontal axis (x-axis), while the distribution of the construction types in the Indonesian translations is shown by the bar height (i.e., the vertical y-axis) and the different colours.
Figure 10 The construction types pertaining to the TARGET and the GOODS roles of ROB in English (x-axis) and the distribution of the construction types in the Indonesian translations (y-axis and the colour of the bars).
It can be observed that the predominant construction type for ROB in the Active Voice (AV) is the TARGET-OBJ(ECT) (N = 91) (cf. example (1)a), followed by the TARGET-SUBJ-PASS (N = 29) (example (9)) in the Passive (PASS) Voice. When ROB is used in the TARGET-OBJ construction, five different construction types are used in the Indonesian translations (cf. the different bar colours for the TARGET-OBJ axis in Figure 10). The norm is that the Indonesian translations highly significantly (pHolm < 0.001) and strongly (ϕ=0.81) preserve the same construction. This suggests a high degree of constructional equivalence for the TARGET-OBJECT construction of ROB in the Indonesian translations.
It is interesting to note that three out of the five occurrences of the GOODS-OBJ construction in the Indonesian translations of ROB from the TARGET-OBJ construction involve three different verbal lemmas, namely CURI (11), RAMPAS (12), and AMBIL (13). None of these lemmas is the normative lexical equivalent of ROB in Indonesian, namely the lemma RAMPOK. Presumably, these lemmas have a constructional profile different from that of ROB/RAMPOK, so that a different construction type is used in the Indonesian translations. However, this constructional shift from the TARGET-OBJ to the GOODS-OBJ construction is not statistically significant. Qualitatively, the translators may have felt such a shift to be necessary and suitable in the Indonesian context.
Another qualitatively interesting constructional shift, though also not statistically significant, is the shift from the TARGET-OBJ construction (English) into the TARGET-SUBJ-PASS construction (Indonesian). In all three cases, the shift happens when the TARGET-OBJ construction of ROB in the source text occurs in an Object Relative Clause, that is, when the direct object of ROB is the head of a noun phrase modified by a relative clause headed by ROB (see example (14)).
The TARGET-SUBJ-PASS construction in the English sample, in turn, is always translated with the same construction. The Fisher-Yates Exact test indicates that this is a highly significant (pHolm < 0.001) and very robust pattern (ϕ=0.94) of constructional equivalence for ROB in Indonesian with respect to the TARGET-SUBJ-PASS construction.

Constructional equivalence for STEAL
Figure 5 in §3.2 has demonstrated that STEAL always profiles the GOODS, rather than the TARGET. Figure 11 below visualises the distribution of the construction types in the Indonesian translations of STEAL, especially constructions pertaining to the GOODS and the TARGET roles.
Figure 11 The construction types pertaining to the GOODS and the TARGET roles of STEAL in English (x-axis) and the distribution of the construction types in the Indonesian translations (y-axis and the colour of the bars).
The predominant construction type for STEAL, namely the GOODS-OBJ(ECT) (N = 100) (see example (15)), is highly significantly (pHolm < 0.001) and strongly (ϕ=0.7) maintained in 93% of all the cases in the Indonesian translations. This also suggests that there is a high degree of constructional equivalence for this construction.
Idn: ... seseorang mencuri [radio]GOODS-OBJ
someone steal radio
In the remaining seven percent of the cases, the GOODS-OBJ construction type underwent constructional shifts, as it is translated using different construction types, namely GOODS-SUBJ-PASS (16) and the intransitive (17). As in the case of ROB, the switch from the AV pattern GOODS-OBJ in English into the PASS pattern GOODS-SUBJ-PASS in Indonesian (6 cases in total) is mostly triggered when the English source text appears in an Object Relative Clause construction (4 cases) (16); the other two are interrogative sentences (18). It is thus interesting to see a similar specific translation pattern with STEAL. In example (18), the interrogative pronoun what represents the direct object of STEAL linked to the GOODS role.
Our use of PASS in the GOODS/TARGET-SUBJ-PASS construction glosses over two passive-like constructions in Indonesian: the canonical passive with the verbal prefix di- (14) and the so-called Objective Voice (OV) (Arka & Manning, 2008) with a bare, unprefixed verb (16). We did this because we primarily focus on the syntactic function of the GOODS and the TARGET in these two passive-like constructions, namely as the grammatical subject of the clause. According to our intuition as native speakers of Indonesian, translating the GOODS/TARGET-OBJ in an English Object Relative Clause into GOODS/TARGET-SUBJ-PASS sounds natural and can also be considered equivalent given the structure of the target language, Indonesian.
For completeness, the minor construction GOODS-SUBJ-PASS is also highly significantly (pHolm < 0.001) and robustly (ϕ=0.7) preserved in the Indonesian translations in 90% of the 10 cases of the construction attested in the English sample. In sum, there is an overall high degree of constructional equivalence in the Indonesian translations for the construction types of STEAL.
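The ϕ values reported in this section serve as effect sizes for the strength of association in a 2×2 co-occurrence table. A minimal sketch of the standard formula is given below; the function name phi_coefficient and the example counts are assumptions for illustration and are not the study's data.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2x2 table [[a, b], [c, d]].

    phi = (a*d - b*c) / sqrt((a+b) * (c+d) * (a+c) * (b+d))
    """
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / sqrt(denom)


# Invented counts (NOT the study's data):
#   a = construction used in English and in Indonesian, b = in English only,
#   c = in Indonesian only, d = in neither.
print(f"phi = {phi_coefficient(80, 11, 2, 27):.2f}")

# A table with no off-diagonal cases yields a perfect association (phi = 1).
print(f"phi = {phi_coefficient(5, 0, 0, 7):.2f}")
```

Because ϕ depends on both off-diagonal cells, a construction that is always translated by the same construction can still have ϕ below 1 if other source constructions are also translated into it.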

Conclusion
This paper has addressed one aspect of the study of equivalence, namely the degree of translation equivalence. We have integrated (i) a classic yet important idea in translation theory that equivalence is measurable in probabilistic terms, (ii) recent developments in quantitative corpus linguistic methods, and (iii) Construction Grammar theory into a study of equivalence beyond the word level, namely constructional equivalence for the verbal near-synonyms ROB and STEAL. We have discovered an overall high degree of constructional equivalence between the predominant constructional profiles of R&S in English and the way they are translated into Indonesian (§3). More specifically, we have shown this in the (de-)profiling of the participant roles (§3.2.1) and in the construction types representing the mapping of the participant roles onto the core syntactic functions (§3.3). Such a high degree of equivalence could be due to the translators being unconsciously confined to the form and meaning of the source language.
We have also shown that quantitative analyses of the qualitative data allow us to measure certain norms, or typical options, in how the constructional properties of R&S are translated into Indonesian, at least based on data from the movie/TV subtitle genre. Furthermore, the publicly shared, annotated database of this study may offer translation researchers one example of how to handle parallel concordance data and document the usage properties of the source and target texts. Our next step in this project is to compare the constructional profile of the Indonesian translations of R&S in translated Indonesian text with the constructional profile found in original, non-translated Indonesian text. Do the constructional profiles that we found in the translated text resemble those produced outside the context of translation from English? The answer to this question could provide a glimpse of the naturalness of the Indonesian translation norms of R&S (cf. §3.1) in their constructional profiles, given the constructional profiles of the same verbs in a non-translated Indonesian corpus.