Please give some suggestions on the revision of commonly used Chinese characters! It's urgent, I'm waiting online!
1. The Significance of Formulating the Chinese Character List of General Specification
The standardization of modern Chinese characters began to take shape in the first half of the 20th century. After the founding of New China, the work entered a new phase under the direct leadership of the State Council. Since the 1950s, the State Council's language and writing authorities and other relevant departments have issued a series of Chinese character specifications, which brought an initial standardization to the characters used in society and promoted the development of education, culture, science and technology in China. Over the past 20 years, with the rapid advance of national modernization and informatization, language life in China has changed enormously, and language norms are now tied to social development and people's lives more broadly and deeply than ever before. Under these new conditions, the earlier norms can no longer fully meet the needs of modern language life. It is therefore necessary to re-examine the earlier standardization work and formulate new, workable norms.
The Chinese Character List of General Specification is formulated to meet the needs of national informatization. Since the turn of the century, China has pursued the strategy of building an innovative country, and informatization has become an important support for that goal. Language standardization is the foundation of national informatization: only with standardized Chinese characters as the carrier of communication can the speed and reliability of information dissemination be ensured.
The list is also formulated to meet the needs of contemporary language life. With the development of science and technology and the rise in education levels, the range of scientific and technological terms keeps expanding, and these terms quickly enter daily life. The printing industry has left the era of "lead and fire", and laser phototypesetting has become the main means of publishing. Computers are widely used in collating ancient books and compiling dictionaries. Missing and incorrect characters in computer fonts directly affect the quality and social benefit of printing, publishing and information dissemination. In social life, information storage and retrieval in government departments and service industries (such as household registration, postal services, finance and insurance) have been digitized. Nonstandard and rare characters in surnames and place names have hampered the construction of information systems in many industries and the government's social administration, and have brought great inconvenience to the daily lives of the people concerned. Formulating character norms suited to modern language life has thus become a major matter affecting people's lives and a need of the general public.
The list is further formulated to meet the needs of the national language law. The Law of the People's Republic of China on the Standard Spoken and Written Chinese Language, promulgated in October 2000, stipulates that "the state promotes Putonghua and standardized Chinese characters". It further specifies that "state organs use Putonghua and standardized Chinese characters as their official language and script", that "schools and other educational institutions teach Putonghua and standardized Chinese characters through Chinese language courses. The Chinese textbooks used shall conform to the norms and standards of the national common language", and that standardized Chinese characters are the basic characters used in the public service industry. The promulgation of this law gave the standardization of Chinese characters a legal footing, which makes it necessary to provide a clear model of "standardized Chinese characters" for general use in society. Formulating the list is an important measure for ensuring that the law is implemented smoothly. It integrates and optimizes the existing Chinese character specifications, removes the contradictions between different standard lists, consolidates scattered specifications, clarifies the policy orientation and legal effect of the "specifications", and determines what the legal concept of "standardized Chinese characters" covers at the general level, so that the "standardized Chinese characters" referred to in the law can be put into practice in general social use.
2. Principles followed in formulating the Chinese Character List of General Specification
The formulation of the general standard Chinese character list follows the following four principles:
(1) Maintain continuity with the existing specifications and keep the Chinese character system basically stable. The list adheres to the basic policy of character simplification, and also follows the principle that "further simplification of Chinese characters should be approached with caution, so that the forms of Chinese characters remain relatively stable for a period of time", as stated in the Notice on Abolishing the Second Plan for Simplifying Chinese Characters (Draft) and Correcting the Confusion of Chinese Characters in Society, approved by the State Council in 1986. After decades of practice, existing specifications such as the First Batch of Variant Characters Arrangement Table, the Simplified Chinese Characters Summary Table, the Commonly Used Printed Chinese Characters Table, the Modern Chinese Common Words List and the Modern Chinese General Glossary are worth inheriting and absorbing. The list inherits the principles and main contents of these specifications. On the basis of detailed investigation and careful analysis, the long-standing habits of the public and the degree of social acceptance were fully considered, and necessary amendments were made on the principle of convenience for the people and benefit to the country.
(2) Adhere to the scientific spirit of seeking truth from facts, and follow the structure and evolution of Chinese characters. In compiling the list, many senior linguists in China were invited to lead and take part in the work, and the opinions of many other linguists and professionals in related fields were sought repeatedly. The latest findings in the study of Chinese characters and their history were absorbed, and scientific statistical methods were used to obtain reliable data. The work also drew widely on practical experience from basic education, the collation of ancient books, dictionary compilation, printing and publishing, and computer information processing. It follows the way Chinese character structure has evolved, takes full account of how characters are actually used, and aims to make the standardization of Chinese characters as scientific and workable as possible.
(3) listen to opinions from all walks of life and take care of the demand for Chinese characters in different fields. The formulation of the character list adheres to the mass line, and the opinions of the broad masses of the people are widely listened to through various means, especially those reflected in the fields of basic education and cultural popularization, so as to meet the different requirements of people in different fields and with different educational levels for the use of Chinese characters as much as possible.
(4) Give due consideration to the needs of Taiwan Province, Hong Kong and Macao in the use of Chinese characters, and to the needs of internationalization. Chinese characters are used in all four regions on both sides of the Taiwan Strait, and have also spread across national borders to many parts of the world. The list should face the objective reality that simplified and traditional forms coexist in the countries and regions where Chinese characters are used, take into account current usage and the varied needs of internationalization, and try not to widen the differences in character use between countries or regions, so as to facilitate communication and exchange.
3. The nature of the Chinese Character List of General Specification
Standardization of Chinese characters has always been an important foundation of cultural construction in New China. The Chinese Character List of General Specification is the supporting standard of the Law of the People's Republic of China on the Standard Spoken and Written Chinese Language, and the main standard reflecting national policy on the use of Chinese characters. It is approved by the State Council and issued by the Ministry of Education and the State Language Commission.
Article 2 of the explanatory notes to the Chinese Character List of General Specification states: "The General Specification Chinese Character List is a set of general specification characters for recording modern Chinese, which embodies the specification of the number, grade and font of modern general Chinese characters." This definition can be understood from three aspects:
Standardization is the primary feature and essential attribute of the list. The list fixes the number of characters and their levels, and standardizes character forms and character usage. It specifies three levels of character sets. The characters and entries collected at each level have been strictly selected and arranged by rule, so the list can fully serve as the orthography of Chinese characters in the general domains of society and as the sole standard for character forms.
Modernity is the second feature of the list. Its grading and its selection of characters are based on modern usage; the characters are drawn mainly from modern Chinese corpora, with full regard for the needs of modern language life. Classical Chinese quoted in modern texts and classical Chinese in Chinese textbooks for primary and secondary schools are needed for cultural inheritance and learning today and have to be printed in simplified characters. These characters therefore count as modern Chinese characters and are also included in the list.
Universality is the third feature of the list. Among the characters used in modern society, some rare or uncommon characters do not need to be standardized. The 8,300 characters that are standardized are the general-use characters that modern society needs. The list is graded by degree of universality: the first-level characters are the most widely used, followed by the second-level characters. Although the third-level characters are taken from specialized fields, they are closely tied to the dissemination of information that matters to the national economy and people's livelihood, and they too are general-use characters where computer storage and reading are concerned.
4. How to implement the Chinese Character List of General Specification in related fields after its release?
The Chinese Character List of General Specification is formulated so that the public and all sectors of society can use Chinese characters more easily. Each field may, according to its own characteristics and actual needs, implement the list in a way that follows the norms while allowing a certain flexibility:
(1) Specialized uses such as the printing of ancient books, calligraphy and inscribed plaques may use traditional characters and inherited characters. To make such texts easier for the public to read and to give printing and editing departments a basis to work from, the traditional characters used in printing ancient books should also be sorted out in due course. Until the relevant standards are formulated, it is suggested that only characters with a record of historical use be used, that is, each form should have a basis in historical usage; that where several forms are available, the widely circulated and easily recognized form be chosen; and that no new characters be invented or forms altered.
(2) The new Chinese curriculum standard for basic education sets the literacy requirement for primary schools at 3,500 characters. These 3,500 characters should be those of the first-level list of the Chinese Character List of General Specification. Where teaching requires subsets within the 3,500 commonly used characters, for example how many characters are to be learned in the first, second and third stages, the education authorities can derive a suitable working list through scientific research, based on the cognitive characteristics of children of different ages and using the relevant attributes of the characters as parameters. It should be stressed that the Chinese Character List of General Specification gives only the standard Song-style forms, whereas the texts for grades 1 to 3 and the new characters for all grades in primary school Chinese textbooks are set in regular script, and the two styles differ slightly in stroke shape. Until a new standard for regular script forms is issued, the existing regular script forms will continue to be used.
(3) Characters outside the list may be used as needed. Where such a character has a traditional form, the regulations do not allow it to be simplified by analogy. To keep the system consistent, where simplification by analogy is truly necessary, as with characters for new chemical elements, the case can be reported to the national language and writing authorities for approval.
(4) If, after the list is published, characters used in surnames are found to be missing from it, they can be reported to the national language and writing authorities and added to the list promptly once they have been reviewed by experts and confirmed by the relevant departments.
(5) Industrial products that use Chinese characters, information products in particular, should follow the principles of Chinese character standardization and should not conflict with the list. Because products take time to update, there can be a transition period after publication for changing characters in products that do not meet the list's requirements. During this period, active measures should be taken to reach the national standard as soon as possible.
5. The rationale for grading the characters in the Chinese Character List of General Specification, and how the number of characters in the first- and second-level lists was determined
As a national standard, the list is meant for all users. Users differ in education level, in the range of their communication and in the needs of their industries, so their requirements for Chinese characters are bound to differ. Moreover, the frequency with which individual characters are used varies greatly, as does how widely they are known. Therefore, only by grading its characters can the list reflect the actual use of Chinese characters, cater for the needs of different groups of people and be practical to use.
The list contains 8,300 characters, divided into three levels by degree of universality. The number of characters at each level, the standard form of each character and the scope of use of some characters are clearly defined. The first-level list contains 3,500 characters. It is the set of most frequently used characters in general social use and mainly meets the needs of basic education and cultural popularization. The second-level list contains 3,000 characters, which are in general social use but are clearly less frequent than the first-level characters. Together, the first- and second-level lists contain 6,500 characters and mainly meet the needs of printing and publishing in modern Chinese. The third-level list contains 1,800 characters: characters used in surnames and given names, place names, technical terms, and classical Chinese in Chinese textbooks for primary and secondary schools that did not enter the first- or second-level list. It mainly meets the demand for characters in special fields closely related to public life.
Commonly used characters are those with the highest frequency; they cover most ordinary language material and are the ones most often met in reading and writing. There must be enough of them, but more is not always better. Determining their number and identity accurately matters a great deal for the transmission of Chinese character information, literacy teaching, the compilation of character books and other applications. In theory, the list rests on the important principle of the "diminishing utility of Chinese characters". Frequency reflects how characters are actually used, and usage is not spread evenly across characters. As frequency falls, each additional character adds less and less coverage, and below a certain frequency the cumulative coverage barely increases at all. The number of commonly used characters can therefore be estimated from the following data: (1) a list of characters in descending order of frequency, obtained from a general corpus; (2) the number of new, distinct characters contributed by each frequency band; (3) the increase in coverage contributed by each frequency band.
To obtain these data, the choice of corpus and its size directly affect how representative, objective and accurate the statistics are. For the Chinese Character List of General Specification, the Balanced Corpus of Modern Chinese of the State Language Commission was chosen as the basic corpus. Built on the principle of balance across time and subject field, this corpus contains 91.91 million characters of text from 1919 to 2002, with 8,181 distinct characters. In addition, two further corpora were used for reference: Beijing Language and Culture University's Dynamic Circulation Corpus of Modern News Media (350 million characters from 15 newspapers and periodicals, with material through 2002) and the Comprehensive Corpus of Educational Science (material from 1951 to 2003) established by the research group for the list.
In determining the number of characters at the first and second levels, objective statistical data was always the main basis for judgment. However, because Chinese characters and their use are strongly shaped by human factors, the boundary of a character set is not sharp, and there is always a certain number of "borderline characters". Only by adjusting these borderline characters manually can the list reflect actual usage more accurately. For the first-level list, the following manual interventions were made: (1) the 10 uppercase numerals, the 22 characters of the Heavenly Stems and Earthly Branches, the abbreviations of the 31 provinces (and municipalities directly under the central government) and the missing characters of some commonly used two-character bound words were filled in; (2) high-frequency colloquial characters from the Children's Literature Corpus (5.7 million characters of children's literature of various genres, published after 1949 and suitable for reading in basic education) were added; (3) the 300 characters ranked 3,201 to 3,500 by frequency were put to primary and secondary school teachers in a questionnaire survey, the added characters took the places of characters from this group, and the displaced characters were moved down to the second-level list. Characters adjusted by manual intervention account for about 3% of the total. For the second-level list, the candidate characters were checked one by one; those far removed from modern meanings or whose forms are entangled with other characters were excluded, and their places were filled with relatively high-frequency characters from the third-level list.
Compared with the original 3,500 characters of the Modern Chinese Common Words List, the first-level list differs by 103 characters, yet its coverage in different corpora is 0.09% to 0.22% higher. Compared with the 7,000 characters of the original Modern Chinese General Glossary, the 6,500 characters of the first- and second-level lists are 500 fewer, yet coverage in different corpora is basically unchanged. Coverage holds steady with fewer characters because decades of character standardization in China have had an effect: characters of poor applicability have dropped out naturally, and the characters used in society have become relatively concentrated. It also shows that the selection of characters and the quantitative data behind the list have been optimized, and that the method is scientific and the procedure reasonable.
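The three-level structure described above can be sketched as a simple lookup. This is only an illustration: the level sizes come from the text, but the function `build_level_index` is a made-up name, and the characters 甲, 乙, 丙 and 丁 are placeholders that say nothing about real membership in any level.

```python
# Level sizes as stated in the text.
LEVEL_SIZES = {1: 3500, 2: 3000, 3: 1800}


def build_level_index(levels):
    """Map each character to its level.

    `levels` is {level number: iterable of characters}. A character may
    belong to one level only, so a repeat raises ValueError.
    """
    index = {}
    for level, chars in sorted(levels.items()):
        for ch in chars:
            if ch in index:
                raise ValueError(f"{ch!r} is in levels {index[ch]} and {level}")
            index[ch] = level
    return index


# Placeholder data, not real list membership.
toy_levels = {1: "甲乙", 2: "丙", 3: "丁"}
index = build_level_index(toy_levels)

print(sum(LEVEL_SIZES.values()))          # total size of the list
print(LEVEL_SIZES[1] + LEVEL_SIZES[2])    # levels 1 and 2 together
print(index.get("丙"), index.get("戊"))    # level of a listed and an unlisted character
```

Returning `None` for an unlisted character keeps "outside the list" distinct from any of the three levels.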
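The three kinds of data listed above can all be derived from a frequency count. The following is a minimal sketch under stated assumptions, not the working group's actual procedure: the function names are invented for the example, the sample text is a toy, and only the basic CJK Unified Ideographs block is counted.

```python
from collections import Counter


def coverage_by_rank(text):
    """Return (character, frequency, cumulative coverage) rows, most frequent first."""
    # Count Han characters only (basic CJK block, a simplification).
    counts = Counter(ch for ch in text if "\u4e00" <= ch <= "\u9fff")
    total = sum(counts.values())
    rows, running = [], 0
    # Descending frequency; ties broken by code point so the order is stable.
    for ch, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += n
        rows.append((ch, n, running / total))
    return rows


def chars_needed(rows, target):
    """Smallest number of top-ranked characters whose coverage reaches `target`."""
    for rank, (_, _, cov) in enumerate(rows, start=1):
        if cov >= target:
            return rank
    return len(rows)


# Toy corpus: 12 characters, 5 distinct.
sample = "的的的的的一一一是是不了"
rows = coverage_by_rank(sample)
for ch, n, cov in rows:
    print(ch, n, round(cov, 3))
print(chars_needed(rows, 0.8))
```

In the toy corpus the first three characters already cover more than 80% of the text, while the last two add under 10% each. On a real corpus the same flattening of the cumulative curve is what suggests where to cut off the set of commonly used characters.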
6. Why the third-level list of the Chinese Character List of General Specification was established, and its nature
The third-level list was established mainly to meet the demand for characters in special fields at the general level. As the number of characters used in information technology keeps growing, a list limited to basic commonly used characters could meet the needs of daily life but could not solve the problems of computer storage and of character use in special fields. It is therefore necessary to add some specialized characters that are closely related to public life. Beyond the basic commonly used characters, the Chinese Character List of General Specification adds a third level that supplies characters for surnames and given names, place names, technical terms, and classical Chinese in primary and secondary school textbooks, effectively solving the problem of missing characters in information processing in these four fields.
Although the third-level characters cannot be collected from a balanced corpus by frequency and coverage, they still belong to the general level. This needs to be explained from two sides:
First, the concept of universality needs to be understood in full. Universality has two aspects: one is printing and the other is reading. Characters that can be found in a balanced corpus are, of course, in general use. But there are also characters that are not often printed yet are often read. For example, the names of medicines and the technical vocabulary of nutrition are printed only in professional fields, but their readers are found in households everywhere. Similarly, the small number of classical Chinese characters used in Chinese textbooks for basic education are not commonly used in society, but the textbooks are printed every year and teachers, students and parents all have to deal with them, so their readership also reaches almost every household. These characters likewise need unified forms and input specifications.
Second, in the information age the concept of universality has changed substantially. Besides use by people, storage and use by computers must also be considered. When Chinese characters were mainly handwritten, characters in these specific fields caused little trouble even without fixed standards. In the era of computers and the Internet, characters that are hard to find in a general corpus by frequency have not lost their universality in people's daily lives. Information carried by Chinese characters needs universality in circulation and also needs it in storage. Personal names, for example, apart from those of celebrities, are not used much at the general level of society; place names, apart from big cities and famous scenic spots, are likewise of limited general use, and neither can be collected by frequency and usage. Yet services found everywhere, such as postal services, finance and transportation, and documents such as identity cards, academic certificates, medical insurance cards and property certificates, must have every character that may occur held in reserve. If these characters are not standardized in information processing, confusion will arise in the storage and use of social information. Characters in these specific fields are hard to collect from a general corpus and have to be gathered from character lists supplied by the fields themselves. General-use characters of specialized fields are an indispensable supplement to the general-use characters of society at large.
7. Principles, scope and specific sources for the third-level list of the Chinese Character List of General Specification
Three principles were followed in deciding which characters to accept into the third-level list: (1) selection is based on concrete facts of character use, so each character must have a source in a character list or document, or use cases and sources supplied by the relevant functional departments; (2) the sound and meaning of each character must be fully established, and characters for naming must be suitable for naming; (3) to ensure universality, rare characters of little use are not accepted. The characters accepted fall into four groups: characters for surnames and given names, characters for place names, characters for technical terms, and characters for classical Chinese in primary and secondary school textbooks.
(1) Characters for surnames and given names. China is a multi-ethnic country. Surnames reflect the lineage and blood ties of a people and also serve as each citizen's identifying label, so their characters cannot be changed at will. To ensure reliable dissemination of information, the list should collect as many surname characters as possible. The characters used in given names are at present rather chaotic. Some cannot be found even in the international coded character set, which has grown to more than 70,000 characters. As a result, some second-generation ID cards cannot be produced because the font lacks the character, which causes great inconvenience in people's lives. The Chinese Character List of General Specification cannot undo the confusion and trouble in existing names, but it can supply useful characters for future naming (mainly of newborns) and renaming, and can guide people to use fewer rare characters and to avoid naming with incorrect characters. It is therefore necessary to collect surname characters as completely as possible and to select from existing names enough characters suitable for naming, so that personal names can circulate effectively in society.
(2) Characters for place names. Characters used in place names are often not in general use nationwide but are common among local residents. Within a province, the characters for place names at township level and above are indispensable. Because of dialects, various "dialect characters" or locally coined characters often arise, which leads to confusion in character use. Information storage and retrieval for household registration, postal services, finance and other industries is now fully digitized, and satellite positioning systems are widely used. Confusion in the characters used for place names will plainly obstruct the flow of information in the industries concerned.
(3) Characters for technical terms. With the development of science and technology, rising education levels and the popularization of scientific knowledge, many scientific and technological terms have quickly entered daily life. Take the characters used for chemical elements: many appear in the names of drugs. Once prescriptions are entered by computer rather than written by hand, these characters become the medium of communication among doctors, pharmacists, patients and their families. Many pesticides, fertilizers and interior decoration materials have to be marketed under their scientific names. The use of cosmetics and detergents, explanations of diet and health, and the collection and forecasting of weather data also attract general attention once they enter popular science. In particular, these characters are used in compiling and printing teaching materials for every discipline.
(4) Characters for classical Chinese in primary and secondary school textbooks. Standardized Chinese characters are used mainly to write modern Chinese texts, but tradition, history and the present are not cut off from one another. Modern Chinese texts quote classical Chinese works, and Chinese textbooks for primary and secondary schools include some outstanding classical works. The former can be collected from a general balanced corpus, but to ensure that textbooks are printed to standard, the characters of the classical Chinese texts in Chinese textbooks need to be collected separately.
These four aspects are not only the important fields of using Chinese characters in the information age, but also the fields that are easy to lack Chinese characters in information dissemination. The specific number of words received is as follows:
There are 930 characters for surnames and given names. They are drawn mainly from sampling survey data for 18 provinces and cities in 1982 and from surname and given-name characters provided by the Ministry of Public Security, supplemented as appropriate by some ancient surnames and the names of influential historical figures.
There are 465 characters for place names, drawn mainly from the characters for place names at township level and above provided by the Ministry of Civil Affairs, from some village names and names of natural features provided by the State Bureau of Surveying and Mapping, and from characters marked as "place names" in commonly used Chinese reference books.
There are 276 characters for scientific and technological terms, drawn mainly from 56 subject categories such as traditional Chinese medicine, botany, genetics, metallurgy, microbiology and soil science provided by the National Committee for the Examination and Approval of Scientific and Technological Terms, and from 33 categories of science and technology, humanities and social sciences provided by the Institute of Linguistics of the Chinese Academy of Social Sciences.
There are 362 characters for classical Chinese in primary and secondary school textbooks, extracted mainly from the "Corpus of Classical Chinese Textbooks for Primary and Secondary Schools" established by the working group that developed the list (5.6 million characters of primary and secondary school Chinese textbooks and popular classical Chinese readers, from 1949 to 2007).
The characters from these four sources were merged and de-duplicated, those already in the first- and second-level lists were removed, and then obscure characters, incorrect characters and variant characters were removed, leaving 1,800 characters in total. Because the third-level characters come from merging four groups and eliminating repeats, a given character does not always have a single attribute; some characters belong to several fields at once.
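The merging step described above can be sketched as follows. The four source counts given in the text add up to 930 + 465 + 276 + 362 = 2,033, so merging and screening removed 233 entries to arrive at 1,800. The code is only an illustration under assumptions: `build_level3` is an invented name, and the characters are placeholders with no relation to real list membership.

```python
def build_level3(domains, level12, rejected=frozenset()):
    """Merge per-domain character collections into one de-duplicated mapping.

    domains:  {domain name: iterable of characters}
    level12:  characters already in the first- or second-level list
    rejected: characters screened out (obscure, incorrect or variant forms)
    Returns {character: sorted list of the domains it came from}.
    """
    merged = {}
    for name, chars in domains.items():
        for ch in set(chars):
            if ch in level12 or ch in rejected:
                continue
            merged.setdefault(ch, set()).add(name)
    return {ch: sorted(names) for ch, names in merged.items()}


# Source counts as stated in the text.
SOURCE_COUNTS = {"names": 930, "places": 465, "terms": 276, "classical": 362}
print(sum(SOURCE_COUNTS.values()) - 1800)  # entries removed by merging and screening

# Placeholder data only.
toy_domains = {"names": "甲乙丙", "places": "丙丁", "terms": "戊", "classical": "乙己"}
level3 = build_level3(toy_domains, level12={"甲"}, rejected={"己"})
print(level3)
```

Keeping the list of source domains for each character preserves the point made above: a third-level character may carry more than one attribute.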
8. Handling of Variant Characters in the Chinese Character List of General Specification
Variant characters in the strict sense should be defined as a group of characters with the same sound and meaning and the same function in recording words, but with different shapes, which can replace one another in any context without affecting the meaning expressed. From a functional point of view, such variants are redundancy in the character system: they only add to the burden of memory and need to be standardized. On December 22, 1955, the explanatory notes to the First Batch of Variant Characters Arrangement Table stated: "Since the date of implementation, newspapers, magazines and books published nationwide have stopped using variant forms in brackets. But if you need to reprint ancient books in the original words, you can make an exception." These notes make it clear that variant characters are "nonstandard characters" and cannot be used in writing modern Chinese texts at the general level. However, some of the "variants" identified in that table are not variants in the strict sense. Treating all of them as "nonstandard characters" and abolishing them is sometimes an obstacle to expressing meaning accurately.
For details, please refer to China Language and Literature Website (website:).