Multivariate binary sequence prediction with CRF



























This question is an extension of this one, which focuses on LSTM as opposed to CRF. Unfortunately, I do not have any experience with CRFs, which is why I'm asking these questions.



Problem:



I would like to predict a sequence of binary signals for multiple, non-independent groups. My dataset is fairly small (~1000 records per group), so I would like to try a CRF model here.



Available data:



I have a dataset with the following variables:




  1. Timestamps

  2. Group

  3. Binary signal representing activity


Using this dataset, I would like to forecast group_a_activity and group_b_activity, both of which are 0 or 1.



Note that the groups are believed to be cross-correlated, and additional features can be extracted from the timestamps. For simplicity, we can assume that only one feature is extracted from the timestamps.
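
To make the data layout concrete, here is a minimal sketch of turning long-format records (timestamp, group, activity) into one row per timestamp with one activity column per group, plus a single timestamp-derived feature. The sample values and the choice of hour of day as the feature are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical long-format records: one row per (timestamp, group) pair.
records = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2018-12-31 09:00", "2018-12-31 09:00",
        "2018-12-31 10:00", "2018-12-31 10:00",
        "2018-12-31 11:00", "2018-12-31 11:00",
    ]),
    "group": ["a", "b", "a", "b", "a", "b"],
    "activity": [1, 0, 0, 0, 1, 1],
})

# One row per timestamp, one activity column per group.
wide = records.pivot(index="timestamp", columns="group", values="activity")
wide.columns = ["group_{}_activity".format(g) for g in wide.columns]

# A single timestamp-derived feature (hour of day is an arbitrary choice).
wide["hour"] = wide.index.hour

print(wide)
```

The random 'a', 'b' and 'extra' columns in the reproducible setup play the role of these three columns.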



What I have so far:



Here is the data setup, which you can reproduce on your own machine:



# libraries
import re
import numpy as np
import pandas as pd

data_length = 18  # how long our data series will be
shift_length = 3  # how long of a sequence do we want

df = (pd.DataFrame  # create a sample dataframe
      .from_records(np.random.randint(2, size=[data_length, 3]))
      .rename(columns={0: 'a', 1: 'b', 2: 'extra'}))
df.head()  # check it out

# shift (assuming data is sorted already)
colrange = list(df.columns)  # snapshot of the original columns before new ones are added
shift_range = [s for s in range(-shift_length, shift_length + 1) if s != 0]
for c in colrange:
    for s in shift_range:
        if not (c == 'extra' and s > 0):
            charge = 'next' if s > 0 else 'last'  # 'next' variables are what we want to predict
            formatted_s = '{0:02d}'.format(abs(s))
            new_var = '{var}_{charge}_{n}'.format(var=c, charge=charge, n=formatted_s)
            # shift(-s): a negative shift pulls later rows up, so 'next' columns
            # hold future values and 'last' columns hold past values
            df[new_var] = df[c].shift(-s)

# drop unnecessary variables and trim missings generated by the shift operation
df.dropna(axis=0, inplace=True)
df.drop(colrange, axis=1, inplace=True)
df = df.astype(int)
df.head()  # check it out

#    a_last_03  a_last_02  ...  extra_last_02  extra_last_01
# 3          0          1  ...              0              1
# 4          1          0  ...              0              0
# 5          0          1  ...              1              0
# 6          0          0  ...              0              1
# 7          0          0  ...              1              0
# [5 rows x 15 columns]
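
Because the 'last'/'next' column names depend on the sign convention of pandas' shift, it is worth checking which direction a positive shift moves the data. A small sketch on a made-up series (the values are arbitrary):

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50])

lagged = s.shift(1)   # row i holds the value from row i-1 (the past)
lead = s.shift(-1)    # row i holds the value from row i+1 (the future)

print(pd.DataFrame({"value": s, "shift(1)": lagged, "shift(-1)": lead}))
```

A positive shift therefore produces lagged values, so a column that is meant to hold future values needs a negative shift.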


Before we get to the CRF part: I suspect that I cannot approach this problem from a multi-task learning point of view (predicting patterns for both A and B with one model), and that I will therefore have to predict each of them individually.
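
For reference, one way the joint prediction could in principle be framed (a sketch under assumptions, not something validated on this data) is to merge the two binary signals into a single four-valued label per time step, so that a single-output sequence model sees one label sequence instead of two. The helper names encode_joint and decode_joint are hypothetical.

```python
def encode_joint(a_seq, b_seq):
    """Combine two binary sequences into one label sequence ('00', '01', '10', '11')."""
    return ["{}{}".format(a, b) for a, b in zip(a_seq, b_seq)]


def decode_joint(labels):
    """Split joint labels back into the two binary sequences."""
    a_seq = [int(lab[0]) for lab in labels]
    b_seq = [int(lab[1]) for lab in labels]
    return a_seq, b_seq


a_next = [0, 1, 1]
b_next = [1, 1, 0]
joint = encode_joint(a_next, b_next)
print(joint)                # ['01', '11', '10']
print(decode_joint(joint))  # ([0, 1, 1], [1, 1, 0])
```

The cost of this encoding is that the number of labels grows as 2^k with k groups, which may matter with only ~1000 records per group.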



Now the CRF part. I've found some relevant examples (here is one), but they all tend to predict a single class value based on a prior sequence.



Here is my attempt at using a CRF:



import pycrfsuite

crf_features = []  # a container for features
crf_labels = []  # a container for response
# let's focus on group A only for this one
current_response = [c for c in df.columns if c.startswith('a_next')]
# predictors are going to have to be nested otherwise I'll run into problems with dimensions
current_predictors = [c for c in df.columns if 'next' not in c]
current_predictors = set([re.sub(r'_\d+$', '', v) for v in current_predictors])
for index, row in df.iterrows():
    # not sure if it's an effective way to iterate over a DF...
    iter_features = []
    for p in current_predictors:
        pred_feature = []
        # note that 0/1 values have to be converted into booleans
        for k in range(shift_length):
            iter_pred_feature = p + '_{0:02d}'.format(k + 1)
            pred_feature.append(p + "=" + str(bool(row[iter_pred_feature])))
        iter_features.append(pred_feature)
    iter_response = [row[current_response].apply(lambda z: str(bool(z))).tolist()]
    crf_labels.extend(iter_response)
    crf_features.append(iter_features)

trainer = pycrfsuite.Trainer(verbose=True)
for xseq, yseq in zip(crf_features, crf_labels):
    trainer.append(xseq, yseq)

trainer.set_params({
    'c1': 0.0,  # coefficient for L1 penalty
    'c2': 0.0,  # coefficient for L2 penalty
    'max_iterations': 10,  # stop earlier
    # include transitions that are possible, but not observed
    'feature.possible_transitions': True
})

trainer.train('testcrf.crfsuite')
tagger = pycrfsuite.Tagger()
tagger.open('testcrf.crfsuite')
tagger.tag(xseq)  # xseq is the last sequence left over from the loop above
# ['False', 'True', 'False']
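
For comparison, pycrfsuite treats each element of xseq as one position in the sequence and pairs it with the label at the same position in yseq. In the loop above, the outer list has one entry per predictor, which only lines up with the three labels because there happen to be three predictors. A sketch of an alternative layout with one feature list per future step follows; the helper row_to_sequence and the "step=k" feature are hypothetical choices, and the example row is made up.

```python
def row_to_sequence(row, groups=("a", "b", "extra"), target="a", n_steps=3):
    """Turn one windowed row (column name -> 0/1) into (xseq, yseq).

    xseq holds one feature list per future step; yseq holds one label per step.
    """
    # All lagged observations, shared by every future step.
    lagged = [
        "{}_last_{:02d}={}".format(g, k, bool(row["{}_last_{:02d}".format(g, k)]))
        for g in groups
        for k in range(1, n_steps + 1)
    ]
    # A step marker lets the model weight the same lags differently per horizon.
    xseq = [["step={}".format(k)] + lagged for k in range(1, n_steps + 1)]
    yseq = [str(bool(row["{}_next_{:02d}".format(target, k)]))
            for k in range(1, n_steps + 1)]
    return xseq, yseq


example_row = {
    "a_last_01": 1, "a_last_02": 0, "a_last_03": 0,
    "b_last_01": 0, "b_last_02": 1, "b_last_03": 1,
    "extra_last_01": 1, "extra_last_02": 0, "extra_last_03": 0,
    "a_next_01": 0, "a_next_02": 1, "a_next_03": 0,
}
xseq_example, yseq_example = row_to_sequence(example_row)
print(len(xseq_example), len(yseq_example))  # 3 3
print(xseq_example[0][:3])  # ['step=1', 'a_last_01=True', 'a_last_02=False']
print(yseq_example)         # ['False', 'True', 'False']
```

Each (xseq, yseq) pair produced this way could be passed to trainer.append in place of the nested lists; whether it predicts better is an open question.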


It seems that I did manage to get it working, but I'm not sure whether I've approached it correctly. I'll formulate my questions in the Questions section, but first, here is an alternative approach using the keras_contrib package:



from keras import Sequential
from keras_contrib.layers import CRF
from keras_contrib.losses import crf_loss

# we are gonna have to revisit data prep stage again
# separate predictors and response
response_df_dict = {}
for g in ['a', 'b']:
    response_df_dict[g] = df[[c for c in df.columns if 'next' in c and g in c]]

# reformat for LSTM
# the response for every row is a matrix with depth of 2 (the number of groups) and width = shift_length
# the predictors are of the same dimensions except the depth is not 2 but the number of predictors that we have

response_array_list = []  # despite the name, this collects the predictor arrays
col_prefix = set([re.sub(r'_\d+$', '', c) for c in df.columns if 'next' not in c])
for c in col_prefix:
    current_array = df[[z for z in df.columns if z.startswith(c)]].values
    response_array_list.append(current_array)

# reorder axes into samples (1), time stamps (2) and channels/variables (0)
# np.transpose moves the axes; np.reshape would only re-chunk the flat data and mix up values
response_array = np.array([response_df_dict['a'].values, response_df_dict['b'].values])
response_array = np.transpose(response_array, (1, 2, 0))
predictor_array = np.array(response_array_list)
predictor_array = np.transpose(predictor_array, (1, 2, 0))

model = Sequential()
model.add(CRF(2, input_shape=(predictor_array.shape[1], predictor_array.shape[2])))
model.summary()
model.compile(loss=crf_loss, optimizer='adam', metrics=['accuracy'])
model.fit(predictor_array, response_array, epochs=10, batch_size=1)
model_preds = model.predict(predictor_array)  # not gonna worry about train/test split here
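
When arrays stacked as (channels, samples, time steps) are rearranged into (samples, time steps, channels), np.reshape and np.transpose give different results even though the output shapes match. A small sketch with made-up values that makes the difference visible:

```python
import numpy as np

n_samples, n_steps = 4, 3
a = np.arange(n_samples * n_steps).reshape(n_samples, n_steps)  # values 0..11
b = a + 100                                                     # values 100..111

stacked = np.array([a, b])  # shape (2, 4, 3): (channel, sample, step)

reshaped = np.reshape(stacked, (n_samples, n_steps, 2))  # re-chunks the flat data
transposed = np.transpose(stacked, (1, 2, 0))            # moves the axes

print(transposed.shape)            # (4, 3, 2)
print(transposed[0, 0].tolist())   # [0, 100]
print(reshaped[0, 0].tolist())     # [0, 1]
```

With the transpose, entry [i, t] holds the pair (a, b) for sample i at step t; with the reshape it holds two consecutive values of a, so the channels get mixed.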


Questions:



My main question is whether I've constructed both of my CRF models correctly. What worries me is that (1) there is not a lot of documentation out there on CRF models, (2) CRFs are mainly used for predicting a single label given a sequence, (3) the input features are nested, and (4) I'm not sure whether using a CRF in a multi-task fashion is valid.



I have a few extra questions as well:




  1. Is a CRF appropriate for this problem?

  2. How do the two approaches (one based on pycrfsuite and one based on keras_contrib) differ, and what are their advantages and disadvantages?

  3. In a more general sense, what is the advantage of combining CRF and LSTM models into one (like the one discussed here)?
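
As background for question 3, the usual description of an LSTM-CRF is that the LSTM supplies a score for each label at each step, and the CRF layer adds label-to-label transition scores and decodes the best whole sequence instead of choosing each step on its own. A plain-Python sketch of that decoding step (Viterbi) follows; the emission and transition scores are invented, not learned.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label path for a linear-chain model.

    emissions[t][y]   : score of label y at step t (e.g. from an LSTM)
    transitions[p][y] : score of moving from label p to label y
    """
    n_labels = len(emissions[0])
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for y in range(n_labels):
            # Best previous label for ending in y at step t.
            best_prev = max(range(n_labels),
                            key=lambda p: score[p] + transitions[p][y])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][y]
                             + emissions[t][y])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label.
    best = max(range(n_labels), key=lambda y: score[y])
    path = [best]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


emissions = [[2.0, 0.5], [0.4, 0.6], [2.0, 0.0]]   # step 1 weakly prefers label 1
transitions = [[1.0, -1.0], [-1.0, 1.0]]           # switching labels is penalised

independent = [max(range(2), key=lambda y: e[y]) for e in emissions]
print(independent)                      # [0, 1, 0]
print(viterbi(emissions, transitions))  # [0, 0, 0]
```

The per-step argmax flips to label 1 in the middle, while the sequence-level decode keeps label 0 throughout because the transition scores outweigh the weak per-step preference.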


Many thanks!










Tags: keras, classification, crf, sequence-to-sequence, crfsuite






asked Dec 31 '18 at 12:45 by IVR, edited Jan 1 at 11:33























