TFRecord를 이용한 학습 (1)

tfrecord를 이용한 학습코드를 분석해보자. 이번에는 코드의 전반적인 부분만 살펴보고 다음에 공부가 필요한 함수 몇 가지를 따로 살펴보도록 하자.

Import package

import tensorflow as tf
from tensorflow.keras.applications.densenet import preprocess_input
from tensorflow.keras import optimizers
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '3'

Load data

train_raw_image_ds = tf.data.TFRecordDataset('train/train.tfrecord')
valid_raw_image_ds = tf.data.TFRecordDataset('valid/valid.tfrecord')

미리 만들어 놓은 train.tfrecord 파일과 valid.tfrecord 파일을 가져온다. tfrecord 파일을 만들 때는 각 이미지에 해당하는 tfrecord 파일을 만든다. 예를들어 train datase이 1000개 라면 1000개의 tfrecord 파일을 생성하고, 1000개의 tfrecord 파일을 1개의 tfrecord로 합쳐주는 형식으로 작업을 했다.

Parsing and Prerocessing data

image_size = (800,800)
tratget_size = (224,224)
batch_size = 4

image_feature_description = {
        'height': tf.io.FixedLenFeature([],tf.int64),
        'width': tf.io.FixedLenFeature([],tf.int64),
        'depth': tf.io.FixedLenFeature([],tf.int64),
        'image_raw': tf.io.FixedLenFeature([], tf.string),
        'label': tf.io.FixedLenFeature([], tf.int64),
    }

def _parse_image_function(example_proto):
    features = tf.io.parse_single_example(example_proto, image_feature_description)
    image = tf.image.decode_image(features['image_raw'], channels=3)
    image = tf.cast(image, tf.float32)
    image = tf.reshape(image, [*image_size, 3])
    image.set_shape([*image_size, 3])
    image = tf.image.resize(image, [224,224],method='bicubic')

    label = tf.cast(features['label'], tf.float32)

    return image, label

def read_dataset(batch_size, dataset):
    dataset = dataset.map(_parse_image_function)
    dataset = dataset.prefetch(10)
    dataset = dataset.shuffle(buffer_size=10 * batch_size)
    dataset = dataset.batch(batch_size, drop_remainder=True)

    return dataset

image_size 는 기존에 만들어 두었던 영상의 크기이다. 예를들어 이미지 파일을 tfrecord 파일로 변환할 때, 원본크기를 (800,800) 사이즈로 일괄적으로 변형해주어 생성했다. 800이라는 숫자는 예제를 위해서 생성한 숫자이며, 일괄적으로 동일한 크기를 적용한 이유는 코드를 짜보면 알겠지만, 아마 shape에 대한 에러에 직면하게 될 것이다. 제각각인 원본이미지 크기를 그대로 저장하게 될 경우 배치로 parsing하는 과정에서 문제가 생긴다. 이부분에서 많이 헤맸는데, tensorflow의 장점은 gpu사용을 통한 배치 처리인데, 이때 배치처리 조건을 생각해보면, 데이터 셋의 형태가 동일해야 한다는 것이다. (다른방법이 있을 지는 모르겠으나 개인적으로는 그렇게 생각한다.) 그래서 위의 소스로는 각각의 데이터 셋에 대한 height, width 값을 가져올 수 없다. 그렇기 때문에 tfrecord 파일을 생성할 때, 동일한 크기로 resize 하여 저장하길 바란다.

read_dataset 을 통해 데이터를 배치화하고 _parse_image_function 로 파싱 및 전처리를 해주면 된다. 파싱조건은 tfrecord를 생성할 때의 조건을 그대로 설정해주면 된다.

Prepare data & Train

train_dataset = read_dataset(batch_size, train_raw_image_ds)
valid_dataset = read_dataset(batch_size, valid_raw_image_ds)
print(train_dataset)
print(train_dataset)

<BatchDataset shapes: ((4, 224, 224, 3), (4,)), types: (tf.float32, tf.float32)>
<BatchDataset shapes: ((4, 224, 224, 3), (4,)), types: (tf.float32, tf.float32)>

# setting options for training
initial_learning_rate = 1e-5
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate, decay_steps=20, decay_rate=0.96, staircase=True
)

checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    "./my_model.h5",
    monitor='val_loss',
    save_best_only=True,
    verbose=1,
    save_weights_only=False,
    mode='auto',
    period=1
)

early_stopping_cb = tf.keras.callbacks.EarlyStopping(
    patience=10,
    restore_best_weights=True,
    min_delta=0,
    verbose=1,
    mode='auto'
)

학습을 위한 옵션 설정은 각자가 원하는 대로 설정하기 바란다.

def make_model():
    base_model = tf.keras.applications.DenseNet201(
        include_top=False,
        weights="imagenet",
        input_tensor=None,
        input_shape=None,
        pooling=None
    )
    
    base_model.trainable = False
    
    inputs = tf.keras.layers.Input([*tratget_size, 3])
    x = preprocess_input(inputs)
    x = base_model(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.425)(x)
    output = tf.keras.layers.Dense(1,activation='sigmoid')(x)
    model = tf.keras.models.Model(inputs=[inputs],outputs=[output])

    model.compile(
                  loss = 'binary_crossentropy',
                  optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule),
                  metrics =['accuracy']
    )

    return model


model = make_model()

history = model.fit(
    train_dataset,
    epochs=10,
    validation_data=valid_dataset,
    verbose=1,
    # batch_size = batch_size,
    callbacks=[checkpoint_cb, early_stopping_cb],
)

binary classification의 경우 softmax를 이용했을 때, tensor shape에 대한 에러났다. 그래서 sigmoid로 변경했고, loss도 categorical_crossentropy에서 binary_crossentropy 로 수정했다. 그리고 batch_size의 경우에는 데이터를 준비하는 단계에서 설정한 batch size 그대로 training 시 반영이 되었다. training step을 관찰 했을 때 데이터 수 / 배치크기 를 관찰 할 수 있었다. 그래서 별도로 fit() 함수에서 batch size를 설정해주지 않았다.

Train time

.tfrecord

14308/14308 [==============================] - 604s 42ms/step

.png

14308/14308 [==============================] - 1689s 117ms/step

Deep Learning Engineer

Import package

Load data

Parsing and Prerocessing data

Prepare data & Train

Train time