Data Extraction Process of Road Major Accident Reports and Usability Testing of Information Presentation by Data Visualization on Website

Chakkarin Santirattanaphakdi

Authors

Chakkarin Santirattanaphakdi, Computer Information Systems Program, Faculty of Business Administration, Vongchavalitkul University

Abstract

Thailand collects accident data for use in road safety planning. Major accident reports are one such source, but they are qualitative and circulate only within a limited audience, because users must synthesize the content manually. Some of the information also contains errors and is not standardized, since it comes from different informants.

This research has two objectives. The first is to design and develop a data extraction process for major road accident reports. Reports are collected from the website by web scraping, and six items are extracted: province, date, number of deaths, number of injured, vehicle type, and number of vehicles involved in the accident. The extracted data are presented to users as data visualizations to support road safety planning by area or by time period. In the accuracy evaluation, province name, date, and vehicle type, which were extracted with a corpus-based (lexicon) approach combined with Named Entity Recognition (NER), were highly accurate, with accuracy depending on how completely the lexicon covers the vocabulary. The number of deaths, number of injured, and number of vehicles, which were extracted with regular expressions combined with NER, were relatively less accurate than the corpus-based items, because the reports are not recorded in a standard format.

The second objective is to test the usability of presenting major road accident report information through data visualization on a website. Thirty users were selected by purposive sampling, a non-probability method. Each user tried the website and rated it through a questionnaire three times, and the ratings were averaged. Overall usability was rated at a good level. The users were then divided into three groups by context of use (computer experts, traffic data personnel, and general users) and compared with an F-test followed by LSD (least significant difference) comparisons. The computer expert group and the traffic data group differed significantly in learnability at the 0.05 level. This suggests that experience with related traffic systems affects learnability more than experience with computers and smartphones: the computer experts had more experience with computers and smartphones, yet rated learnability lower. The findings offer guidance for the future design and development of semi-structured data extraction and of information presentation through data visualization.

Keywords: Data Extraction, Road Accidents, Usability Testing, Data Visualization
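The two extraction strategies named in the abstract (lexicon lookup for categorical items, regular expressions for counts) can be illustrated with a minimal Python sketch. This is not the author's implementation: the lexicons, the patterns, the sample sentences, and the names `extract` and `first_count` are all hypothetical, and the NER step is omitted. The second call shows one assumed way a non-standard report can defeat a count pattern, here a number written out in words.

```python
import re

# Hypothetical mini-lexicons; a real one would have to cover every province
# and every vehicle type that appears in the reports.
PROVINCES = ("นครราชสีมา", "เชียงใหม่", "ขอนแก่น")
VEHICLE_TYPES = ("รถกระบะ", "รถโดยสาร", "รถบรรทุก")

# Assumed patterns: keyword, optional spaces, digits, then a Thai classifier.
DEATHS_RE = re.compile(r"เสียชีวิต\s*(\d+)\s*(?:ราย|คน|ศพ)")
INJURED_RE = re.compile(r"บาดเจ็บ\s*(\d+)\s*(?:ราย|คน)")


def first_count(pattern, text):
    """Return the first number captured by the pattern, or None if no match."""
    match = pattern.search(text)
    return int(match.group(1)) if match else None


def extract(text):
    """Extract province, vehicle types, deaths and injured from one report."""
    return {
        # Lexicon lookup: first province in the lexicon found in the text.
        "province": next((p for p in PROVINCES if p in text), None),
        # Lexicon lookup: every known vehicle type mentioned in the text.
        "vehicle_types": [v for v in VEHICLE_TYPES if v in text],
        # Regular expressions: counts written as digits after a keyword.
        "deaths": first_count(DEATHS_RE, text),
        "injured": first_count(INJURED_RE, text),
    }


# Made-up report sentence in a "standard" form.
report = "รถกระบะชนรถบรรทุกที่จังหวัดนครราชสีมา มีผู้เสียชีวิต 3 ราย บาดเจ็บ 5 คน"
result = extract(report)

# Made-up non-standard form: the death count is spelled out in words,
# so the digit-based pattern finds nothing while the lexicon still works.
nonstandard = extract("รถโดยสารพลิกคว่ำที่เชียงใหม่ เสียชีวิตสามราย")
```

In this sketch the lexicon items succeed whenever the word is in the list, while the counts depend on the sentence matching the assumed pattern, which mirrors the accuracy gap reported above.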


