TY - CONF
T1 - Repairing DNN Architecture: Are We There Yet?
T2 - 16th IEEE International Conference on Software Testing, Verification and Validation, ICST 2023
AU - Kim, Jinhan
AU - Humbatova, Nargiz
AU - Jahangirova, Gunel
AU - Tonella, Paolo
AU - Yoo, Shin
N1 - Funding Information:
Jinhan Kim and Shin Yoo have been supported by the Engineering Research Center Program through the National Research Foundation of Korea (NRF) funded by the Korean Government (MSIT) (NRF-2018R1A5A1059921), NRF Grant (NRF-2020R1A2C1013629), Institute for Information & communications Technology Promotion grant funded by the Korean government (MSIT) (No. 2021-0-01001), and Samsung Electronics (Grant No. IO201210-07969-01). This work was partially supported by the H2020 project PRECRIME, funded under the ERC Advanced Grant 2017 Program (ERC Grant Agreement n. 787703).
Publisher Copyright:
© 2023 IEEE.
PY - 2023/5/26
Y1 - 2023/5/26
AB - As Deep Neural Networks (DNNs) are rapidly being adopted within large software systems, software developers are increasingly required to design, train, and deploy such models into the systems they develop. Consequently, testing and improving the robustness of these models have received a lot of attention lately. However, relatively little effort has been made to address the difficulties developers experience when designing and training such models: if the evaluation of a model shows poor performance after the initial training, what should the developer change? We survey and evaluate existing state-of-the-art techniques that can be used to repair model performance, using a benchmark of both real-world mistakes developers made while designing DNN models and artificial faulty models generated by mutating the model code. The empirical evaluation shows that a random baseline is comparable with, or sometimes outperforms, existing state-of-the-art techniques. However, for larger and more complicated models, all repair techniques fail to find fixes. Our findings call for further research to develop more sophisticated techniques for Deep Learning repair.
KW - deep learning
KW - hyperparameter tuning
KW - program repair
KW - real faults
UR - http://www.scopus.com/inward/record.url?scp=85161801973&partnerID=8YFLogxK
U2 - 10.1109/ICST57152.2023.00030
DO - 10.1109/ICST57152.2023.00030
M3 - Conference paper
AN - SCOPUS:85161801973
SN - 9781665456678
T3 - International Conference on Software Testing, Verification, and Validation (ICST)
SP - 234
EP - 245
BT - 2023 IEEE Conference on Software Testing, Verification and Validation (ICST)
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 16 April 2023 through 20 April 2023
ER -